Abstract

We demonstrate in this paper that the probabilities for sequential mea- 
surements have features very different from those of single-time measure- 
ments. First, they cannot be modeled by a classical stochastic process. 
Second, they are contextual, namely they depend strongly on the specific 
measurement scheme through which they are determined. We construct 
Positive-Operator-Valued Measures (POVMs) that provide such probabilities.
For observables with continuous spectrum, the constructed POVMs
depend strongly on the resolution of the measurement device, a conclu- 
sion that persists even if we consider a quantum mechanical measurement 
device or the presence of an environment. We then examine the same 
issues in alternative interpretations of quantum theory. We first show 
that multi-time probabilities cannot be naturally defined in terms of a 
frequency operator. We next prove that local hidden variable theories 
cannot reproduce the predictions of quantum theory for sequential mea- 
surements, even when the degrees of freedom of the measuring apparatus 
are taken into account. Bohmian mechanics, however, does not fall in 
this category. We finally examine an alternative proposal that sequen- 
tial measurements can be modeled by a process that does not satisfy the 
Kolmogorov axioms of probability. This removes contextuality without 
introducing non-locality, but implies that the empirical probabilities cannot
always be defined (the event frequencies do not converge). We argue
that the predictions of this hypothesis are not ruled out by existing exper- 
imental results (examining in particular the "which way" experiments); 
they are, however, distinguishable in principle. 

Email: anastop@physics.upatras.gr

1 Introduction



1.1 The main theme 

The topic of this paper is sequential quantum measurements and their proba- 
bilistic description. We show that the construction of probabilities for sequential 
measurements is rather intricate and in some aspects differs strongly from its 
analogues in classical probability theory. For example, quantum multi-time 
probabilities do not define a stochastic process. We isolate specific points
of divergence between quantum and classical probability theory (including hidden
variable theories in the latter) and argue that these differences can be
empirically determined, at least in principle.

The motivation for this line of inquiry is twofold. First, the determination
of probabilities in sequential measurements is of interest in its own right. It
seems experimentally feasible, as it is nowadays possible to construct sources
that emit individual systems. However, the construction of such probabilities
from the rules of standard quantum theory is not as straightforward as it may
seem, for the relevant probabilities cannot be obtained in a natural way from the
Hilbert space geometry. Assumptions about the physical implementation of the 
measurement process are needed, and these touch inevitably upon fundamental 
interpretational issues. 

An immediate result of our analysis is that multi-time probabilities are 
strongly dependent upon the specific experimental set-up used in their determi- 
nation. For observables corresponding to operators with discrete spectrum, one 
may construct a probability distribution rather simply. The same procedure
applied to observables with continuous spectrum leads to probabilities that depend
very strongly on an additional parameter $\delta$. This parameter can be interpreted
as the resolution of the measurement device, but the dependence of the resulting
probabilities is so strong as to be highly counter-intuitive. This dependence
persists even for samplings coarse-grained at a scale much larger than $\delta$. An
interesting corollary of this analysis is that it is impossible to simulate by a 
stochastic process the probabilities obtained from sequential measurements of a 
quantum system. 

The other motivation for this research is related to basic interpretational 
issues of quantum theory. Probabilities are introduced in the quantum mechanical
formalism through Born's interpretation of the wave function. Born's
rule is valid for single-time measurement of one observable (or for a family of 
compatible observables). In that case, quantum theory is reduced to a descrip- 
tion in terms of classical probabilistic concepts, which describe successfully the 
statistical outcomes of experiments. 

But once one moves away from this context, the coexistence between quan- 
tum theory and classical probability theory becomes less harmonious. This 
is highlighted by three representative theorems: Bell's, Wigner's and
Kochen-Specker's [1, 3].

The violation of Bell's inequalities (and their generalisations) implies that
local hidden variables theories are ruled out by experiment. This may imply 
either quantum non-locality, or that it is impossible to define a sample space 
for a physical system in itself, without referring to the specific experiment that 
is carried out. The latter property is referred to as contextuality of quantum 
properties (or measurements). Wigner's theorem is a representative of a more 
general result: it is not possible to define a joint probability distribution for 
variables that correspond to non-commuting operators. This can be argued to 
be a form of contextuality, in the sense that there does not exist a universal 
sample space to describe the outcomes of all possible measurements that can be 
performed in an ensemble of quantum systems. The Kochen-Specker theorem 
demonstrates a stronger form of contextuality: it is impossible to assign definite 
values to a physical observable without referring to the commuting set that is 
measured along with it. 

While all three theorems above suggest that quantum mechanical properties 
(and consequently probabilities) are contextual, they do not easily relate to
empirical evidence. The observed violation of Bell's inequalities may be attributed
to non-locality rather than contextuality; the measurement of incompatible
observables involves distinct experimental situations, whose outcomes cannot be
immediately compared, while the Kochen-Specker theorem refers to idealized 
values of observables that a physical system possesses prior to measurement 
(hence empirically inaccessible). 

Sequential measurements, on the other hand, provide a ground on which the
idea of contextuality can be explicitly tested. The application of the rules of
standard quantum theory suggests that two different measurement schemes will
give rise to different values for the probability of the same property of a physical
system, even if the initial state is assumed to be the same. Hence the precise 
statistical study of the outcomes in sequential measurements may in principle 
reveal unambiguously the contextual character of quantum probability. 

The problem is that we obtain much more contextuality than we bargained 
for. Not only are multi-time probabilities dependent on the measurement scheme 
through which they are determined, but they seem to depend strongly on rather 
trivial details of the measurement device. This is unavoidable, at least if we 
do not abandon the usual rules of quantum theory. It is then questionable 
whether it is possible to properly define a statistical ensemble for sequential 
measurements, or even if any physical information can be extracted from them. 

This rather disturbing feature of multi-time probabilities provides the moti- 
vation to seek an alternative account. We first consider hidden variable theories. 
We prove that any local hidden variable theory (deterministic or stochastic) 
that reproduces the single-time probabilities of quantum theory cannot repro- 
duce those for multi-time probabilities. The only way to do so is by assuming a 
non-local interaction between system and measuring device, similar to the one 
appearing in Bohmian mechanics. 

The other alternative we examine here is related to proposals [4, 5] (see also
[6]) that it might be possible to avoid contextuality (the constraints of Bell's
and Kochen-Specker's theorems) by assuming that quantum theory is described
by a "probability" measure that does not satisfy the Kolmogorov axioms, in
particular the additivity property. While a non-additive measure is mathematically
natural in multi-time probabilities, its physical interpretation is somewhat
problematic. A non-additive measure cannot be interpreted in terms of any
empirical probabilities, which are obtained by the limit of event frequencies. It
only makes sense if one assumes that the event frequencies for sequential
measurements do not converge to probabilities. We explore this idea further, showing
that it is consistent with the usual treatment of probabilities in quantum theory,
that it is natural from an operational point of view and that it is in principle 
distinguishable from any alternative that assumes that empirical probabilities 
for sequential measurements always exist. 

1.2 The structure of this paper 

The paper is structured as follows. 

In Section 2 we briefly review classical and quantum probability theory, in 
order to set up our conventions. We also provide some preliminary mathematical
arguments about the inequivalence between the classical and quantum
descriptions of sequential measurements. 

Section 3 contains the central results of this paper. First, we motivate the 
discussion on probabilities of sequential measurements, focusing in particular on 
the fact that the quantum mechanical correlation functions are complex-valued 
and have no immediate correspondence in terms of objects that can be immedi- 
ately determined. Then we demonstrate that quantum logic cannot be expected 
to hold in sequential measurements. This is unlike single-time measurements for 
which the spectral theorem together with Born's rule guarantee that different 
measurement schemes lead to the same probability assignment (assuming iden- 
tical preparation). Multi-time probabilities are therefore highly contextual. We 
then discuss the description of multi-time probabilities via
Positive-Operator-Valued Measures (POVMs). We prove two theorems that demonstrate that it is
not possible to construct POVMs for sequential measurements compatible with 
the single-time predictions of quantum theory. These results provide a general 
proof of an often quoted statement that quantum mechanical probabilities can- 
not be simulated by stochastic processes. We then demonstrate different ways 
of constructing POVMs for a specific class of multi-time measurements of po- 
sition. These POVMs exhibit a very strong dependence on properties of the 
measurement device (its resolution) that persist even in highly coarse-grained 
samplings. Finally we show that neither the consideration of a fully quantum
measuring device nor that of decoherence due to the environment significantly
affects these conclusions.

In Section 4 we discuss other interpretational schemes, most notably hidden
variable theories, and we demonstrate that the predictions of quantum theory
for sequential measurements are not compatible with the assumption of local
interactions between measured system and measuring device. Finally in Section 
5 we consider the alternative proposal that probabilities for sequential mea- 
surements cannot be defined because the relative frequencies do not converge. 
The motivation for this proposal is analysed in detail. We then demonstrate 
that it is compatible with the predictions of single-time quantum theory, that
it is not contradicted by some well-established results and that it is possible to 
distinguish it unambiguously even in very simple experimental set-ups. 

2 Classical Vs quantum probability 

2.1 Basic facts 

We briefly describe here the mathematical structure of classical and quantum 
probability in order to set up our notation, conventions and terminology for
later use. 

2.1.1 Classical probability theory 

In classical probability one assumes that all possible elementary alternatives lie
in a space $\Omega$, the sample space. Observables are functions on $\Omega$, and are usually
called random variables. The outcome of any measurement can be phrased as
a statement that the system is found in a given subset $C$ of $\Omega$. Hence the set of
certain well-behaved (measurable) subsets of $\Omega$ is identified with the set of all
coarse-grained alternatives of the system. To each subset $C$, there corresponds
an observable $\chi_C(x)$, the characteristic function of the set $C$. It is defined as
$\chi_C(x) = 1$ if $x \in C$ and $\chi_C(x) = 0$ otherwise. It is customary to denote the
characteristic function of $\Omega$ as 1 and of the empty set as 0.
If an observable $f$ takes values $f_i$ in subsets $C_i$ of $\Omega$, then

$$f(x) = \sum_i f_i \chi_{C_i}(x). \qquad (2.1)$$

A state is intuitively thought of as a preparation of a system. Mathematically
it is represented by a measure on $\Omega$, i.e. a map that assigns to each alternative
$C$ a probability $p(C)$. A probability measure satisfies the Kolmogorov conditions

- for all subsets $C$ of $\Omega$, $0 \leq p(C) \leq 1$;
- $p(0) = 0$; $p(1) = 1$;
- for all disjoint subsets $C$ and $D$ of $\Omega$, $p(C \cup D) = p(C) + p(D)$.

Due to (2.1) one can define $p(f) = \sum_i f_i p(C_i)$; $p(f)$ is the mean value of
$f$. In the case that $\Omega$ is a subset of $R^n$, the probability measures are defined in
terms of a probability distribution, i.e. a positive function on $\Omega$, which we shall
denote as $p(x)$:

$$p(f) = \int dx \, p(x) f(x). \qquad (2.2)$$
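
As a minimal sketch of these definitions, assuming a finite sample space (a fair six-sided die, chosen purely for illustration), the following Python fragment builds a measure, checks the Kolmogorov conditions on all subsets and evaluates the mean value $p(f) = \sum_i f_i p(C_i)$. The names `omega`, `weight`, `prob` and `mean` are hypothetical helpers introduced for this example only.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical finite sample space: the faces of a fair die.
omega = frozenset(range(1, 7))
weight = {x: Fraction(1, 6) for x in omega}

def prob(subset):
    """Measure p(C) of a subset C of omega."""
    return sum((weight[x] for x in subset), Fraction(0))

def mean(f):
    """Mean value p(f) = sum_i f_i p(C_i), with C_i the level sets of f."""
    values = {f(x) for x in omega}
    return sum(v * prob({x for x in omega if f(x) == v}) for v in values)

# Kolmogorov conditions: normalisation, bounds, additivity on disjoint sets.
assert prob(frozenset()) == 0 and prob(omega) == 1
subsets = [frozenset(c) for r in range(len(omega) + 1)
           for c in combinations(sorted(omega), r)]
assert all(0 <= prob(c) <= 1 for c in subsets)
assert all(prob(c | d) == prob(c) + prob(d)
           for c in subsets for d in subsets if not (c & d))

parity = lambda x: x % 2          # observable taking the values 0 and 1
print(mean(parity))               # mean value of the parity observable
print(mean(lambda x: x))          # mean value of the face value
```

Exact rational arithmetic is used so that the additivity condition can be tested with equality rather than with a numerical tolerance.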



2.1.2 Quantum probability theory 

The formalism of quantum mechanics incorporates probability through Born's
rule, which in its initial form asserts that the square modulus $|\psi(x)|^2$ of Schrödinger's
wave function can be interpreted as a probability density for the particle's
position. In the abstract Hilbert space formulation Born's interpretation can be
implemented through the spectral theorem: under rather general conditions we
may assign a Projection-Valued Measure (PVM) $d\hat{E}(\lambda)$ to each self-adjoint
operator $\hat{A}$. The PVM is a map assigning to each measurable set $U$ of $\hat{A}$'s spectrum
$\sigma(\hat{A})$ a projection operator $\hat{E}(U) = \int_U d\hat{E}(\lambda)$, such that $\hat{E}(U) = \chi_U(\hat{A})$, where
$\chi_U$ is the characteristic function of $U$. The projectors in the range of the PVM
reflect the Boolean algebra of the subsets of $\sigma(\hat{A})$ in the sense that

- $\hat{E}(\emptyset) = 0$, $\hat{E}(\sigma(\hat{A})) = 1$,
- $\hat{E}(U \cup V) = \hat{E}(U) + \hat{E}(V)$ for $U \cap V = \emptyset$,
- $\hat{E}(U \cap V) = \hat{E}(U) \hat{E}(V)$.

The spectral theorem implies that the Hilbert space $H$ is isomorphic to that
of square-integrable functions over $\sigma(\hat{A})$, and as such the Born rule may be
directly applied: the probability for an event corresponding to $U \subset \sigma(\hat{A})$ is

$$p(U) = \mathrm{Tr}\,\hat{\rho} \hat{E}(U). \qquad (2.3)$$

Given that $\hat{A} = \int \lambda \, d\hat{E}(\lambda)$ the standard relation between probabilities and
expectation values can be established.
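
The relation between the Born rule and expectation values can be illustrated in the simplest possible setting. The sketch below assumes a two-level system, takes the Pauli matrix $\sigma_x$ as the observable and a pure state parametrised by an angle `theta`; these choices, and the helper names `matmul`, `trace` and `add`, are assumptions of the example and do not appear in the text.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

# Spectral projectors of sigma_x: E(+1) = |+><+| and E(-1) = |-><-|.
E = {+1: [[0.5, 0.5], [0.5, 0.5]],
     -1: [[0.5, -0.5], [-0.5, 0.5]]}

# Illustrative pure state |psi> = (cos(theta/2), sin(theta/2)).
theta = math.pi / 3
c, s = math.cos(theta / 2), math.sin(theta / 2)
rho = [[c * c, c * s], [c * s, s * s]]

# PVM properties: the projectors on disjoint sets multiply to zero
# and the projector on the whole spectrum is the unit operator.
zero = matmul(E[+1], E[-1])
identity = add(E[+1], E[-1])

# Born rule p(U) = Tr(rho E(U)) and the mean value sum_lambda lambda * p(lambda).
p = {lam: trace(matmul(rho, E[lam])) for lam in E}
mean = sum(lam * p[lam] for lam in E)
sigma_x = [[0.0, 1.0], [1.0, 0.0]]
print(p, mean, trace(matmul(rho, sigma_x)))
```

The mean computed from the probabilities agrees with $\mathrm{Tr}(\hat{\rho}\hat{A})$, which is the single-time equivalence with classical probability referred to in the text.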

It follows that for single-time measurements of a single observable (or of 
many observables represented by mutually commuting operators) quantum the- 
ory via the Born rule is completely equivalent to classical probability theory. 

The Copenhagen interpretation employs the formalism of quantum theory to 
account for the outcomes of specific experiments. It presupposes a split between 
the measured system, which is fully quantum, and the measuring apparatus, 
which is part of the classical world. While this creates the key problem of 
explaining the classical description of an object that consists of fundamentally 
quantum entities, it is fully self-consistent at an operational level, namely if 
we only employ quantum theory to account for the statistics of measurement 
outcomes. 

We shall adopt an operational stance in most discussions in this paper. The 
reason for this choice is that the operational description is a core part of quantum
theory that refers directly to concrete experimental situations, and the
remarkable success of quantum theory implies that all contending interpretations
must accept it, either as a fundamental or as an emergent theory. Still, we shall
find it necessary in the course of the argument to move beyond the operational
description and consider quantum measurement theory, namely the assumption 
that the measuring apparatus is fully or partly quantum mechanical. 

2.1.3 Probabilities and event frequencies 

To apply a specific version of probability theory in a concrete physical system, 
one needs to be able to relate the numbers obtained by the mathematical for- 
malism to the concrete experimental data. This relation is achieved by the 
correspondence of probability to relative frequencies of events in statistical en- 
sembles. While it can be argued that relative frequencies do not exhaust the 
physical content of probability theory, that the latter can be interpreted in a way 
that refers to individual systems and not only statistical ensembles, and even 
that a definition of probabilities from frequencies is highly problematic, any 
sharp quantitative test of a probabilistic theory involves a comparison of the- 
oretical probabilities to empirical probabilities, which are obtained from event 
frequencies. 

Suppose for simplicity that the sample space of our system is $\Omega = R$. We
assume an experiment that determines a value for $x \in R$. Repeating the
experiment $n$ times we obtain a sequence $x_i$, $i = 1, \ldots, n$ of measured values. We
may then consider the relative frequency for the proposition that the variable $x$
took value in the subset $U \subset R$. If $\chi_U$ is the characteristic function of the set
$U$ (footnote 1), we define the relative frequency for the occurrence of an event in $U$ for the
first $n$ experimental runs

$$\nu_n(U) = \frac{1}{n} \sum_{i=1}^{n} \chi_U(x_i). \qquad (2.4)$$

The probability $p(U)$ associated to the event $U$ is the limit

$$p(U) = \lim_{n \to \infty} \nu_n(U), \qquad (2.5)$$

assuming of course that it exists.

Since any actual determination of probabilities involves a finite number of
runs, we can never establish the convergence of frequencies. If, however, the
description of the physical systems in terms of probabilities is valid, one expects
that the relative rate of convergence $\epsilon_n$ scales as $n^{-1/2}$ by virtue of the
central limit theorem. Hence the fall-off of $\epsilon_n$ for large $n$ as $n^{-1/2}$ is a good
indication of convergence for the relative frequencies.
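
The behaviour of the relative frequencies can be illustrated with a short simulation. The sketch below assumes a classical source producing standard Gaussian values and takes $U = (0, \infty)$, so that $p(U) = 1/2$. The diagnostic printed here, $|\nu_n(U) - p(U)|$, is a choice made for this example and is not necessarily the quantity $\epsilon_n$ of the text; the function name `relative_frequency` is likewise hypothetical.

```python
import random

def relative_frequency(samples, indicator):
    """nu_n(U) = (1/n) * sum_i chi_U(x_i), as in equation (2.4)."""
    return sum(1 for x in samples if indicator(x)) / len(samples)

random.seed(0)                      # fixed seed so that the run is reproducible
in_U = lambda x: x > 0.0            # U = (0, infinity)
p_exact = 0.5                       # probability of U for a standard Gaussian

samples = [random.gauss(0.0, 1.0) for _ in range(100_000)]
for n in (100, 1_000, 10_000, 100_000):
    nu_n = relative_frequency(samples[:n], in_U)
    # Compare the deviation with the n**-0.5 scale set by the central limit theorem.
    print(n, nu_n, abs(nu_n - p_exact), n ** -0.5)
```

For a source described by a genuine probability measure the printed deviations should stay of the order of $n^{-1/2}$; a systematic failure of this fall-off is the kind of signature discussed later for sequential measurements.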

The mean value of a random variable $f(x)$ is similarly identified as

$$\langle f \rangle = \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} f(x_i). \qquad (2.6)$$

1 We assume that $U$ is a sufficiently well-behaved set (like an open set) so that there is no
operational problem in ascertaining that $x \in U$.

2.2 The analogue of stochastic processes

We saw that for the measurements of a single observable at a single moment 
of time the predictions of quantum theory are fully compatible with those of 
classical probability. It is then natural to inquire whether this correspondence
carries over when one considers the values of a single observable at more
than one moment of time.

In multi-time measurements the law of time evolution enters explicitly. Let
us assume a classical probabilistic system with sample space $\Omega = R$, whose
probability density evolves in time according to

$$\frac{\partial \rho}{\partial t} = \mathcal{L} \rho, \qquad (2.7)$$

where $\mathcal{L}$ is a positive, norm-preserving operator. The formal solution of this
equation is

$$\rho_t = e^{\mathcal{L} t} \rho_0, \qquad (2.8)$$

which can be written in terms of the integral kernel $g_t(x, x')$ of $e^{\mathcal{L} t}$

$$\rho_t(x) = \int dx' \, g_t(x, x') \rho_0(x'). \qquad (2.9)$$

One may then define a probability measure $d\mu[x(\cdot)]$ on the space of paths
on $\Omega$ as a suitable limit of the expression

$$d\mu(x_{t_1}, x_{t_2}, \ldots, x_{t_n}) = \rho_0(x_0) \, g_{t_1}(x_0, x_1) \, g_{t_2 - t_1}(x_1, x_2) \cdots g_{t_n - t_{n-1}}(x_{n-1}, x_n) \, dx_0 \, dx_1 \, dx_2 \ldots dx_n, \qquad (2.10)$$

which is defined on discrete-time paths.
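
A discrete analogue of this construction may help fix ideas. The sketch below assumes a two-state system with a fixed one-step transition matrix `g` and an initial distribution `rho0` (both hypothetical numbers chosen for illustration) and checks that the path measure is normalised and that its marginal at the final time coincides with the single-time evolved distribution.

```python
from itertools import product

# Hypothetical two-state system; g[x][y] is the probability of a jump x -> y
# in one time step (the discrete analogue of the kernel in (2.10)).
g = [[0.9, 0.1],
     [0.3, 0.7]]
rho0 = [0.25, 0.75]                 # initial distribution rho_0

def path_probability(path):
    """Discrete analogue of (2.10): rho_0(x_0) g(x_0, x_1) ... g(x_{n-1}, x_n)."""
    prob = rho0[path[0]]
    for x, y in zip(path, path[1:]):
        prob *= g[x][y]
    return prob

def evolve(rho):
    """Single-time evolution rho_{t+1}(y) = sum_x rho_t(x) g(x, y)."""
    return [sum(rho[x] * g[x][y] for x in range(2)) for y in range(2)]

paths = list(product(range(2), repeat=3))      # all paths (x_0, x_1, x_2)
total = sum(path_probability(p) for p in paths)

# Marginal at the final time, obtained by summing over the earlier values.
marginal = [sum(path_probability(p) for p in paths if p[2] == y) for y in range(2)]
print(total, marginal, evolve(evolve(rho0)))
```

The agreement between the marginal and the evolved single-time distribution relies on the evolution being linear in the probabilities themselves, which is the property that fails in the quantum case.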

The reason it is possible to extend the single-time probability to a stochastic 
probability measure is that the evolution law is linear with respect to the proba- 
bility density. In quantum theory this is not the case; the evolution law is linear 
with respect to the wave function and not the probability density. Hamiltonian 
evolution mixes the diagonal elements of the density matrix (which correspond 
to probabilities) with the off-diagonal ones (which have no such interpretation). 

It seems therefore not straightforward (if at all possible) to extend the
single-time probabilistic description of quantum theory to a stochastic process. This
conclusion will be verified by a more rigorous analysis in Section 3.3.
Nonetheless, we can write stochastic processes that reproduce some of quantum theory's
predictions. One such example is Nelson's stochastic mechanics [7, 8], which
introduces a stochastic differential equation on configuration space that can
reproduce the expectation values of the position observable at every moment of
time.

2.3 The history formalism

The underlying reason that quantum evolution cannot be described by a stochas- 
tic process is that the mathematically natural measure on histories (or paths) 
does not satisfy the Kolmogorov axioms of probability theory. This is par- 
ticularly highlighted in the consistent histories approach to quantum theory 
[9, 10, 11, 12]. 

The basic object of this formalism is a history, namely a time-ordered
sequence of projection operators $\hat{P}_{t_1}, \ldots, \hat{P}_{t_n}$, and it corresponds to a time-ordered
sequence of propositions about the physical system. The indices $t_1, \ldots, t_n$ refer
to the time a proposition is asserted and have no dynamical meaning. Dynamics
are related to the Hamiltonian $\hat{H}$, which defines the one-parameter group
of unitary operators $\hat{U}(s) = e^{-i\hat{H}s}$. In the consistent histories approach a
history is thought to correspond to propositions about the physical system, not
necessarily associated to acts of measurement. Consistent histories is a
generalisation of Copenhagen quantum theory aiming to provide a quantum mechanical
description of individual systems.

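The quantum rule for conditional probability is that if the property
corresponding to the projector $\hat{P}$ is realized then we may encode the information
obtained in a change of the density matrix (footnote 2)

$$\hat{\rho} \to \frac{\hat{P} \hat{\rho} \hat{P}}{\mathrm{Tr}(\hat{\rho} \hat{P})}, \qquad (2.11)$$

hence the conditional probability that $\hat{P}_2$ will be realized at time $t_2$ given that
$\hat{P}_1$ was realized at $t_1$ equals

$$\frac{\mathrm{Tr}\left(\hat{P}_2 e^{-i\hat{H}(t_2 - t_1)} \hat{P}_1 e^{-i\hat{H}t_1} \hat{\rho} e^{i\hat{H}t_1} \hat{P}_1 e^{i\hat{H}(t_2 - t_1)} \hat{P}_2\right)}{\mathrm{Tr}\left(\hat{P}_1 e^{-i\hat{H}t_1} \hat{\rho} e^{i\hat{H}t_1}\right)}, \qquad (2.12)$$

leading to a probability for the joint realisation of $\hat{P}_1$ at $t_1$ and $\hat{P}_2$ at $t_2$

$$\mathrm{Tr}\left(\hat{P}_2 e^{-i\hat{H}(t_2 - t_1)} \hat{P}_1 e^{-i\hat{H}t_1} \hat{\rho} e^{i\hat{H}t_1} \hat{P}_1 e^{i\hat{H}(t_2 - t_1)} \hat{P}_2\right). \qquad (2.13)$$

For a general $n$-time history $\alpha = \{\hat{P}_{t_1}, \hat{P}_{t_2}, \ldots, \hat{P}_{t_n}\}$ this result generalizes
as follows. We define the class operator $\hat{C}_\alpha$ by

$$\hat{C}_\alpha = \hat{U}^\dagger(t_n) \hat{P}_{t_n} \hat{U}(t_n) \cdots \hat{U}^\dagger(t_1) \hat{P}_{t_1} \hat{U}(t_1), \qquad (2.14)$$

which leads to a probability measure

$$p(\alpha) = \mathrm{Tr}\left(\hat{C}_\alpha \hat{\rho} \hat{C}_\alpha^\dagger\right). \qquad (2.15)$$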

2 We shall argue later that this rule cannot be applied freely, at least as far as measurements
are concerned.

These probabilities do not define a genuine measure on the space of histories.
To see this, we consider two histories $\alpha = \{\hat{P}_{t_1}, \hat{P}_{t_2}, \ldots, \hat{P}_{t_n}\}$ and
$\beta = \{\hat{P}'_{t_1}, \hat{P}_{t_2}, \ldots, \hat{P}_{t_n}\}$, such that $\hat{P}_{t_1} \hat{P}'_{t_1} = 0$. The history
$\{\hat{P}_{t_1} + \hat{P}'_{t_1}, \hat{P}_{t_2}, \ldots, \hat{P}_{t_n}\}$ is the logical join $\alpha \vee \beta$ of the
histories $\alpha$ and $\beta$. The probabilities, however, do not satisfy the additivity condition

$$p(\alpha \vee \beta) = p(\alpha) + p(\beta). \qquad (2.16)$$

If the histories are interpreted as referring to measurements the failure of
the additivity condition is not (at first sight) a problem, because each history
corresponds to a different sequence of YES-NO experiments and there is no a
priori reason why all different experiments should be modeled by a common
probability measure. However, if histories are thought to correspond to properties
of an individual system then the lack of a probability measure becomes a
problem.
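
The failure of additivity can be seen already in a two-level system. The sketch below assumes a trivial Hamiltonian ($\hat{H} = 0$), the initial state $|+\rangle\langle +|$, the orthogonal projectors $|0\rangle\langle 0|$ and $|1\rangle\langle 1|$ at the first time and the projector $|+\rangle\langle +|$ at the second time. All of these choices, and the helper names, are illustrative assumptions rather than an example taken from the text.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def probability(history, rho):
    """p(alpha) = Tr(C rho C^dagger), cf. (2.15), with C = P_{t_n} ... P_{t_1} for H = 0."""
    c = [[1, 0], [0, 1]]
    for projector in history:       # the earliest projector acts first
        c = matmul(projector, c)
    return complex(trace(matmul(matmul(c, rho), dagger(c)))).real

P0 = [[1, 0], [0, 0]]               # |0><0| at time t_1
P1 = [[0, 0], [0, 1]]               # |1><1| at time t_1, orthogonal to P0
Q = [[0.5, 0.5], [0.5, 0.5]]        # |+><+| at time t_2
rho = [[0.5, 0.5], [0.5, 0.5]]      # initial state |+><+|

p_alpha = probability([P0, Q], rho)
p_beta = probability([P1, Q], rho)
p_join = probability([add(P0, P1), Q], rho)
# Additivity would require p_join == p_alpha + p_beta.
print(p_alpha, p_beta, p_join)
```

In this example the coarse-grained history has probability one, while the two fine-grained histories have probability one quarter each, so the condition (2.16) fails by a wide margin.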

In the consistent histories approach this is taken into account as follows.
We define the decoherence functional as a complex-valued function of pairs of
histories: i.e. a map $d : \mathcal{V} \times \mathcal{V} \to C$. For two histories $\alpha$ and $\alpha'$ it is given by

$$d(\alpha, \alpha') = \mathrm{Tr}\left(\hat{C}_\alpha \hat{\rho} \hat{C}_{\alpha'}^\dagger\right). \qquad (2.17)$$

The consistent histories interpretation of this object is that when $d(\alpha, \alpha') = 0$
for $\alpha \neq \alpha'$ in an exhaustive and exclusive set of histories (footnote 3), then one may assign
a probability distribution to this set as $p(\alpha) = d(\alpha, \alpha)$. The value of $d(\alpha, \beta)$ is,
therefore, a measure of the degree of interference between the histories $\alpha$ and
$\beta$.
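
The role of the decoherence functional as a measure of interference can be made explicit: since the class operator of the join of two histories that differ only in the first projector is the sum of their class operators, one has $p(\alpha \vee \beta) = p(\alpha) + p(\beta) + 2\,\mathrm{Re}\, d(\alpha, \beta)$. The sketch below checks this identity numerically for a two-level system, assuming $\hat{H} = 0$, the initial state $|+\rangle\langle +|$ and two alternative final projectors; these choices and all helper names are hypothetical and serve only as an illustration.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def class_operator(history):
    """C = P_{t_n} ... P_{t_1}; the Hamiltonian is set to zero for simplicity."""
    c = [[1, 0], [0, 1]]
    for projector in history:
        c = matmul(projector, c)
    return c

def d(alpha, beta, rho):
    """Decoherence functional d(alpha, beta) = Tr(C_alpha rho C_beta^dagger), cf. (2.17)."""
    return complex(trace(matmul(matmul(class_operator(alpha), rho),
                                dagger(class_operator(beta)))))

P0, P1 = [[1, 0], [0, 0]], [[0, 0], [0, 1]]
identity = [[1, 0], [0, 1]]                    # P0 + P1
rho = [[0.5, 0.5], [0.5, 0.5]]                 # initial state |+><+|
finals = {"|+><+|": [[0.5, 0.5], [0.5, 0.5]],
          "|+i><+i|": [[0.5, -0.5j], [0.5j, 0.5]]}

results = {}
for label, Q in finals.items():
    alpha, beta, join = [P0, Q], [P1, Q], [identity, Q]
    interference = d(alpha, beta, rho)
    lhs = d(join, join, rho).real
    rhs = d(alpha, alpha, rho).real + d(beta, beta, rho).real + 2 * interference.real
    results[label] = (interference, lhs, rhs)
    print(label, interference, lhs, rhs)
```

For the first final projector the interference term is real and additivity fails; for the second it is purely imaginary, so the probabilities of this particular pair happen to add up even though $d(\alpha, \beta)$ does not vanish.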

We end this section with a remark. We shall employ many mathemati- 
cal objects appearing in the consistent histories approach throughout this pa- 
per (without a change in name or notation). The reader should keep in mind 
that the focus of this paper is the description of measurement outcomes through 
the rules of standard quantum theory, hence the context and interpretation of 
these objects are different from those in consistent histories. 

3 Sequential measurements in standard quantum theory

3.1 Multi-time correlation functions 

In classical probability theory there is no conceptual distinction between
single-time and multi-time measurements of a physical system. If the sample space for
the single-time measurement is $\Omega$, the sample space for $n$-time measurements
is the Cartesian product $\times_n \Omega_n$: the outcome of $n$ measurements of an observable
$x$ is an ordered $n$-tuple of values of $x$. In general, one may define a sample
space $\Omega^T$ of all paths from a time interval $T = [0, t]$ to $\Omega$ and a corresponding
stochastic measure $d\mu[x(\cdot)]$. One then immediately transfers the interpretation
of probabilities in terms of relative frequencies and reconstructs the statistical
behavior of any observable on the multi-time sample space.

3 By exhaustive we mean that at each moment of time $t_i$, $\sum_{\alpha_{t_i}} \hat{\alpha}_{t_i} = 1$, and by exclusive
that $\hat{\alpha}_{t_i} \hat{\beta}_{t_i} = \delta_{\alpha\beta} \hat{\alpha}_{t_i}$. Note that by $\alpha$ we denote the proposition with the corresponding projector
written as $\alpha$ with a hat.

The probabilities for the measurements of an observable $x$ are most
conveniently incorporated in the (unequal-time) correlation functions of an observable
$f(x)$,

$$\langle f_{t_1} f_{t_2} \ldots f_{t_n} \rangle = \int d\mu[x(\cdot)] \, F_{t_1}[x(\cdot)] F_{t_2}[x(\cdot)] \ldots F_{t_n}[x(\cdot)], \qquad (3.1)$$

in terms of the functions $F_t$ on $\Omega^T$ defined by

$$F_t[x(\cdot)] = f(x(t)). \qquad (3.2)$$

From an operational point of view, there is no problem in measuring multi- 
time probabilities or correlation functions, as long as the corresponding single- 
time measurements do not destroy the physical system. The same is true for 
quantum mechanical systems: we may consider for example a succession of 
Stern-Gerlach devices, or microscopic particles leaving their trace in sharply 
localized layers of recording material (we shall elaborate on such experiments 
later). We therefore expect that quantum mechanics should allow us to de- 
termine the values of the correlation functions, which can be unambiguously 
determined from experiment. 

The objects we usually call correlation functions in quantum theory are
expectation values of products of operators, such as

$$\langle x_{t_1} x_{t_2} \rangle = \mathrm{Tr}\left(\hat{\rho} \, e^{i\hat{H}t_1} \hat{x} \, e^{i\hat{H}(t_2 - t_1)} \hat{x} \, e^{-i\hat{H}t_2}\right). \qquad (3.3)$$

These "correlation functions" are in general complex-valued, and for this reason
they have no interpretation in terms of the statistics of measurement outcomes.
Clearly, the construction and interpretation of multi-time quantum probabilities
involves many more subtleties than their analogue in the single-time case.
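
That such two-time expectation values are genuinely complex can be checked in a minimal example. The sketch below replaces the position operator by the Pauli matrix $\sigma_x$ of a two-level system with Hamiltonian $\hat{H} = (\omega/2)\sigma_z$ and initial state $|0\rangle\langle 0|$; this substitution, the numerical values and the helper names are assumptions made purely for illustration.

```python
import cmath
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

omega = 1.0                                     # illustrative level splitting

def exp_iHt(t):
    """e^{iHt} for H = (omega/2) sigma_z, which is diagonal in this basis."""
    return [[cmath.exp(0.5j * omega * t), 0], [0, cmath.exp(-0.5j * omega * t)]]

def heisenberg(op, t):
    """Heisenberg-picture operator e^{iHt} op e^{-iHt}."""
    return matmul(matmul(exp_iHt(t), op), exp_iHt(-t))

sigma_x = [[0, 1], [1, 0]]
rho = [[1, 0], [0, 0]]                          # state |0><0|
t1, t2 = 0.0, math.pi / 2

# Two-time "correlation function" Tr(rho x(t1) x(t2)), cf. (3.3).
corr = trace(matmul(rho, matmul(heisenberg(sigma_x, t1), heisenberg(sigma_x, t2))))
print(corr)
```

For this model the result equals $e^{-i\omega(t_2 - t_1)}$, which for the chosen times is purely imaginary, so it cannot be read directly as a correlation of measurement outcomes.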
In light of the discussion above, there are two questions that must be raised. 

- First, what is the physical meaning of the mathematically natural
complex-valued correlation functions?

- Second, how can we employ the standard quantum mechanical formalism (or
slight generalisations thereof) to construct real-valued correlation functions that
would describe the statistics of multi-time measurements?

Before proceeding to address these questions let us comment on a rather naive
answer that can be given to the second one: the physically relevant correlation
functions can be obtained by elementary algebraic manipulations on the
complex-valued ones, taking for example their real part, or their totally
symmetrized version etc. The immediate objection is that any such choice is
completely ad hoc with no justification in terms of the usual principles of quantum
theory. Why should we choose the real part of the correlation function, rather 
than the imaginary part, or their modulus? But even if we decide by fiat that a 
specific answer is the correct one, the problem persists at the level of the prob- 
abilistic interpretations: correlation functions must be related to probabilities 
for sequential measurements. Hence any determination of correlation functions 
must deal with the problem that the mathematically natural probability mea- 
sure for histories is non-additive. 

3.2 Sequential measurements and quantum logic 

We now examine the definition of probabilities for multi-time measurements. 
There is a substantial literature on this topic (see for example [13, 14, 15, 16,
17, 18, 19, 20, 21, 22]); indeed discussion of this issue can be traced back to
the early days of quantum mechanics. Our presentation here aims to highlight 
the specific quantum mechanical features through comparison with analogous 
'experiments' in classical probability. 

For ideal measurements, one may employ equation (2.15) for the probabilities.
This expression defines a non-additive measure on the space of histories
(we use the word histories heuristically here to denote a temporal succession of 
measurement outcomes). On the other hand, any empirical probability that is 
constructed by event frequencies should satisfy the additivity condition. This 
is an apparent contradiction. 

As a first step towards an answer we shall elaborate on specific features of
single-time measurements. In any well-designed experiment, we need to guarantee
that the results do not depend too strongly on specific details of the measurement
device. The reasons for that are epistemological (experiments must
be reproducible) but also practical: minor details of the measurement device
should not affect the experimental outcomes significantly. They should ideally
be hidden within the sampling or systematic errors of the experiment. Moreover,
it would be highly desirable that different measurement schemes for the same
observable and with the same preparation procedure give compatible (if
not identical) results.

We may consider for example two different measurement schemes for the 
position of a particle. In the first, we assume a source emitting electrons with 
well defined momentum in the z-dircction, but with significant spread in the x 
and y directions 4 . At a specific distance from the source we place a photographic 
plate that records the electron's position. This set-up is equivalent to a single- 

4 For example, the z-degrees of freedom may be represented by the wave function ip{z) = 
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Position measurement 



YES-NO measurement 



□ 



photographic plate 

FigUrG 1: Single-time position measurement Vs single time filter measurement of positii 



time measurement of the electron's x and y coordinates. The distribution of 
electrons on the screen corresponds to a probability distribution, which modulo 
sampling errors is given by Born's rule. The number of electrons found in a 
subset U of the plate is proportional to p{U) — J v dxdy\ip(x, y)\ 2 . 

We may also consider a filter measurement of the electron's position, by 
placing instead of a photographic plate a curtain with a hole corresponding to 
the subset U. Any detector placed behind the hole will register a number
of particles proportional to $\mathrm{Tr}(\rho P(U))$. This type of measurement is known as
a YES-NO experiment, because it can only admit two answers: the particle 
passing through U or not. 

The important point is that the value for the probability p(U) in the experiment
with the photographic plate coincides with that obtained from the YES-NO
experiment. Moreover, if we carry out a sufficiently large number of YES-NO
experiments differing only in the position of the hole, we will obtain sufficient 
information to fully reconstruct the probability distribution of the first exper- 
iment. In other words, in single-time quantum theory, YES-NO experiments 
contain the full probabilistic information about a quantum system. The empir- 
ical probabilities for a sample set U are the same in all measurement schemes 
that correspond to the same preparation of the physical system, modulo sam- 
pling and systematic errors. This universality is often referred to as defining a 
logic for quantum measurements, quantum logic. It is in effect a consequence 
of the spectral theorem. 

We now return to the analysis of multi-time measurements. If the only 
possible multi-time experiment that could be carried out were of the YES-NO 
type, there would be no downright problem from the non-additivity of (2.15),
at least not a worse problem than appears in any other quantum "paradox".

$(2\pi\sigma_z^2)^{-1/4} \exp\left(-\frac{z^2}{4\sigma_z^2} + i p_z z\right)$, such that the spread $\Delta p_z = 1/\sigma_z \ll p_z$. This set-up corresponds to
a measurement at a reasonably well-specified moment of time.

Figure 2: A two-time YES-NO measurement of a particle's positions. 

To see this one may consider the following two-time YES-NO experiment 
measuring the position of a particle. We assume a source of electrons prepared 
in the same state as in the previous examples. At fixed distances from the source 
and parallel to the x-y plane we place two curtains with holes corresponding to
the subsets $U_1$ and $U_2$ of the x-y plane. Behind the second slit we place a
particle detector.

Repeating the experiment above n times, we record the number of times the
detector clicks, thus constructing the sequence of relative frequencies $\nu_n(U_1,t_1;U_2,t_2)$,
and from it the corresponding probability $p(U_1,t_1;U_2,t_2)$. To construct the
probability $p(U_1',t_1;U_2,t_2)$ for a different slit corresponding to $U_1'$ we have to
change the experimental configuration, and similarly for $p(U_1 \cup U_1',t_1;U_2,t_2)$.
Hence the probabilities $p(U_1,t_1;U_2,t_2)$, $p(U_1',t_1;U_2,t_2)$ and $p(U_1 \cup U_1',t_1;U_2,t_2)$
do not refer to the same experimental set-up, and there is no contradiction
between Eq. (2.15) and the additive character of relative frequencies.

However, YES-NO measurements are not the only ones possible in practice.
For measurements at a single moment of time they contain all the probabilis- 
tic information of quantum theory, but this does not hold for sequential mea- 
surements. To see this, let us consider the following scheme for a two-time 
measurement of position. We assume a particle source as before, which can be 
controlled so finely as to emit a single particle at a time. Two thin sheets of 
penetrable material are placed one after the other in front of the particle source, 
both parallel to the x-y plane. Particles leave tracks as they cross through the 
sheets, and one may then determine their x and y coordinates. 

Each time the source emits a particle we record the readings $(x_1,t_1;x_2,t_2)_n$;
n labels the experimental runs and the y coordinate is suppressed for brevity.
We thus construct a sequence of measurement outcomes. From this one defines
the sequence $\nu_n(U_1,t_1;U_2,t_2)$ for each pair of subsets $U_1$ of the sheet at $t_1$ and
$U_2$ of the sheet at $t_2$. One obtains the probability $p(U_1,t_1;U_2,t_2)$ as the limit of
$\nu_n(U_1,t_1;U_2,t_2)$ as $n \to \infty$, assuming it exists.

Figure 3: A two-time measurement of a particle's position. The particle leaves a trace on the
plates of penetrable material. One then samples the data into specific subsets of the plates, in order
to construct the corresponding multi-time probabilities.

Unlike the YES-NO experiment, the sequences $\nu_n(U_1,t_1;U_2,t_2)$ constructed for
different choices of the sample sets all refer to the same experimental set-up.
They should therefore satisfy the additivity condition (modulo sampling and
systematic errors)

$$\nu_n(U_1,t_1;U_2,t_2) + \nu_n(U_1',t_1;U_2,t_2) = \nu_n(U_1 \cup U_1',t_1;U_2,t_2), \tag{3.4}$$

since they refer to indivisible and specific measurement events. It follows that 
the probabilities (2. 15) do not describe the outcomes of this experiment. This 
conclusion holds for any multiple-time measurement, in which any possible al- 
ternative of the observable can be recorded at each moment of time, provided 
that the corresponding operator does not commute with the Hamiltonian. One 
could consider, for example, a succession of two Stern-Gerlach apparatuses, with 
different directions of their magnetic fields placed in such a position as to measure
the spin of the particle in the direction $\mathbf{n}$ at time $t_1$ and in the direction $\mathbf{n}'$
at time $t_2$.

Note also that this thought-experiment presupposes that we record the trace 
of each particle individually on the sheets. It is, therefore, essential that in each 
individual run of the experiment the source emits only a single particle. If we 
perform this experiment with beams of particles, we will not have sufficient 
statistical information to construct the two-time probabilities. We would not
be able to ascertain that the particle recorded at $x_1$ at time $t_1$ is the same
as the particle recorded at $x_2$ at time $t_2$. The most we could obtain would be
the two marginal probability distributions, namely the probability density at $t_1$ and
the probability density at $t_2$ (see footnote 5).

5 Experiments like that of Fig. 2 involve only a single act of detection. Hence, even if
they are formally a two-time YES-NO experiment they can also be described as a single-time
measurement of a system, whose wave function satisfies specific boundary conditions on the

There is one point that needs to be highlighted in our discussion. It is well
accepted in quantum theory that the presence of an intermediate measurement 
affects the state of the system and for this reason the experimental outcomes 
depend strongly on whether an intermediate measurement has been carried
out. The same statement could, however, be made for sequential measurements
in a classical probabilistic system. As we shall explicitly prove in
section 4, classical probability cannot give rise to the degree of contextuality 
inherent in quantum theory even if the coupling to measurement devices is 
taken into account. To demonstrate this in detail, we need first to expand on 
the construction of probabilities corresponding to the thought-experiments of 
Fig. 3. 

3.3 POVMs and their applicability 
3.3.1 POVMs and their properties 

Unlike single-time measurements, sequential measurements cannot be described 
by the spectral projectors of a self-adjoint operator. It is therefore necessary 
to employ a generalisation of the notion of quantum mechanical observables,
namely the Positive-Operator-Valued Measures (POVMs).

A POVM is a map that assigns to each measurable subset U of a sample space
$\Omega$ a positive operator $\Pi(U)$, such that

- $\Pi(\Omega) = 1$, $\Pi(\emptyset) = 0$

- $\Pi(U \cup V) = \Pi(U) + \Pi(V)$, for $U \cap V = \emptyset$.

A POVM can therefore define a probability density on $\Omega$ by

$$p(U) = \mathrm{Tr}\left(\rho\,\Pi(U)\right). \tag{3.5}$$
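
As a minimal numerical illustration of the two defining properties and of Eq. (3.5), the following sketch checks them for a two-outcome unsharp measurement on a qubit. The helper names `is_povm` and `probability`, the sharpness parameter `eta` and the chosen state are illustrative assumptions and do not appear in the construction above.

```python
import numpy as np


def is_povm(effects, tol=1e-10):
    """Return True if every effect is positive and the effects sum to the identity."""
    dim = effects[0].shape[0]
    positive = all(np.linalg.eigvalsh(E).min() > -tol for E in effects)
    complete = np.allclose(sum(effects), np.eye(dim), atol=tol)
    return positive and complete


def probability(rho, effect):
    """Eq. (3.5): p(U) = Tr(rho Pi(U))."""
    return float(np.trace(rho @ effect).real)


# Hypothetical unsharp measurement: spectral projectors mixed with the identity.
eta = 0.8  # assumed sharpness; eta = 1 would give back the projectors
P0 = np.diag([1.0, 0.0])
P1 = np.diag([0.0, 1.0])
effects = [eta * P0 + (1 - eta) * np.eye(2) / 2,
           eta * P1 + (1 - eta) * np.eye(2) / 2]

rho = np.array([[0.9, 0.3], [0.3, 0.1]])  # a pure state, chosen for illustration
probs = [probability(rho, E) for E in effects]
print(is_povm(effects), probs)
```

The effects are not projectors (their squares differ from themselves), yet they still assign additive, normalized probabilities to every state, which is all that the definition requires.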

POVMs are generalisations of projection-valued measures (PVMs), usually thought to correspond to unsharp
measurements. Indeed, if we denote by $\lambda$ the points of the spectrum of a self-adjoint
operator A, we may define a POVM as

$$\Pi(U) = \int d\lambda\, \chi^\delta_U(\lambda)\, |\lambda\rangle\langle\lambda|, \tag{3.6}$$

in terms of a family of smeared characteristic functions $\chi^\delta_U$. (For smeared characteristic
functions and their properties see Appendix A.)

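For sufficiently coarse sets U, the positive operators $\Pi(U)$ are close to true
projectors. One may estimate that

$$\left|\mathrm{Tr}\,\rho\left(\Pi(U) - \Pi(U)^2\right)\right| \le \mathrm{Tr}\,\rho\left|\Pi(U) - \Pi(U)^2\right| \le \int d\lambda \left|\chi^\delta_U(\lambda) - [\chi^\delta_U(\lambda)]^2\right| < c\,\delta, \tag{3.7}$$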

walls. This is the reason we shall ignore them and study exclusively measurement schemes
similar to that in Fig. 3.

with c a constant of order unity. Also, for states $\rho$ with spreads in $\lambda$ much larger
than $\delta$ we may compute (see Appendix A)

$$\left|\mathrm{Tr}\,\rho\left(\Pi(U) - \Pi(U)^2\right)\right| < c'\,\frac{\delta}{L}\,\mathrm{Tr}\,\rho\,\Pi(U), \tag{3.8}$$

where L is the size of U.

When a POVM $\Pi$ is defined on a sample space $\Omega = \Omega_1 \times \Omega_2$, we denote
the POVMs $\Pi(\Omega_1, \cdot)$ and $\Pi(\cdot, \Omega_2)$, defined on $\Omega_2$ and $\Omega_1$ respectively, as the
marginal POVMs of $\Pi$.

3.3.2 POVMs for sequential measurements: no-go theorems

One possibility that should be first considered is that the arguments leading 
to equation (2. 15) are somehow inadequate to account for the multi-time 
experiment we considered earlier and that a different procedure should allow us 
to define proper probabilities for multi-time measurements. 

The most general way to define a probability distribution that is linear with
respect to the density matrix is through POVMs. One could therefore conjecture
the existence of a POVM on the sample space $\otimes_n \Omega_n$ for the n-time
measurements. There are, however, limitations (see footnote 6).

Proposition 1. There exists no POVM for n-time measurements of an observ- 
able x compatible with the single-time predictions of quantum theory, unless x 
commutes with the system's Hamiltonian H. 

We consider without loss of generality a POVM for a two-time measurement.
We denote by $\Omega$ the spectrum of x, and by $P(U)$ the spectral projectors of x,
$U \subset \Omega$. The POVM $E(\cdot, t_1; \cdot, t_2)$ assigns to each pair of sample sets $U_1, U_2 \subset \Omega$ a
positive operator $E(U_1,t_1;U_2,t_2)$. It should be compatible with the single-time
predictions of quantum theory, namely

$$\mathrm{Tr}\left(\rho\,E(U_1,t_1;\Omega,t_2)\right) = \mathrm{Tr}\left(\rho\, e^{iHt_1} P(U_1) e^{-iHt_1}\right), \tag{3.9}$$

$$\mathrm{Tr}\left(\rho\,E(\Omega,t_1;U_2,t_2)\right) = \mathrm{Tr}\left(\rho\, e^{iHt_2} P(U_2) e^{-iHt_2}\right).$$

Since this should hold for all $\rho$, the marginals of the POVM E are PVMs,
namely

$$E(U_1,t_1;\Omega,t_2) = e^{iHt_1} P(U_1) e^{-iHt_1}, \tag{3.10}$$

$$E(\Omega,t_1;U_2,t_2) = e^{iHt_2} P(U_2) e^{-iHt_2}. \tag{3.11}$$

6 The results implied from propositions 1 and 2 seem to be well accepted in the consideration 
of sequential measurements. Even though they are rather elementary, we are not aware of any 
explicit proof in the literature, and for this reason we include the proof in the text. They are 
essential for the development of the arguments in Section 4.

There is a general result (see e.g. Theorem 2.1 of reference [15]) that any
POVM whose marginals are PVMs is itself a PVM; it commutes with its
marginals and can be written as the marginals' product. Hence

$$\left[e^{iHt_1} P(U_1) e^{-iHt_1},\; e^{iHt_2} P(U_2) e^{-iHt_2}\right] = 0, \tag{3.12}$$

and since this property holds for all $t_1, t_2$ and subsets $U_1, U_2$, it follows that
$[x, H] = 0$. The probability measure (2.15) is additive in that case and the
correlation functions are real-valued. It follows that in the generic case the probabilities
for n-time measurements cannot be modeled by a stochastic process,
because the latter can only be defined if a compatibility condition of the form
(3.10) is satisfied (see the discussion in section 3.5.1).
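
The commutator condition (3.12) is easy to probe numerically. The sketch below is an illustration for a single qubit, not part of the proof: the helper names `heisenberg` and `commutator_norm`, the choice of Hamiltonians and the times are assumptions made for the example. It evaluates the commutator of the Heisenberg-picture projectors at two times, once for a Hamiltonian that does not commute with the measured observable and once for one that does.

```python
import numpy as np


def heisenberg(P, H, t):
    """Return e^{iHt} P e^{-iHt} for a Hermitian matrix H."""
    w, V = np.linalg.eigh(H)
    U = V @ np.diag(np.exp(-1j * w * t)) @ V.conj().T  # U = e^{-iHt}
    return U.conj().T @ P @ U


def commutator_norm(A, B):
    """Frobenius norm of the commutator [A, B]."""
    return float(np.linalg.norm(A @ B - B @ A))


P_up = np.diag([1.0, 0.0]).astype(complex)  # spectral projector of sigma_z
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)
t1, t2 = 0.0, 0.7  # arbitrary illustrative times

# H = sigma_x does not commute with the observable: (3.12) is violated.
norm_generic = commutator_norm(heisenberg(P_up, sigma_x, t1),
                               heisenberg(P_up, sigma_x, t2))
# H = sigma_z commutes with the observable: the two projectors commute.
norm_commuting = commutator_norm(heisenberg(P_up, sigma_z, t1),
                                 heisenberg(P_up, sigma_z, t2))
print(norm_generic, norm_commuting)
```

Only in the second case can a two-time POVM with both marginals given by (3.10) and (3.11) exist.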

One, however, may object that the requirement that the single-time marginals 
of the POVM's are projectors is too stringent. The physical set-up of a two-time 
measurement is different from that of a single-time measurement, and there is 
no a priori reason for the marginals of the POVM to reduce to those of the 
single-time measurement. One cannot argue so much against equation (3. 10). 
If $t_1 < t_2$ the measurement outcomes at $t_1$ should not depend on whether or not
we choose to perform a second measurement later. However, Eq. (3. 11) may 
very well be problematic, because the physical system has already interacted 
with a measuring device, while in the single-time measurement the evolution 
has been purely unitary. 

Still, even this less restrictive case (namely only equation (3. 10) being 
satisfied) leads to the same conclusions. The proof involves only a few small 
changes from the earlier one, but we reproduce it here for concreteness. 

We consider without loss of generality $t_1 = 0$, $t_2 = t$. For the sample sets
$U_1$, $U_2 = \Omega - U_1$, $V_1$, $V_2 = \Omega - V_1$ we define the positive operators

$$E_{ij} = E(U_j,0;V_i,t), \tag{3.13}$$

$$K_j = E(U_j,0;\Omega,t), \tag{3.14}$$

$$L_i = E(\Omega,0;V_i,t). \tag{3.15}$$

By assumption $K_j$ is a projector, while $L_i$ is a general positive operator.

By definition $0 \le E_{ij} \le K_j$, for both values of i (see footnote 7). Since $K_j$ is a projector,
$E_{ij}$ lies in the closed linear subspace corresponding to $K_j$, with every
j taken separately. Hence $E_{i1}$ commutes with $K_1$ and $E_{i2}$ commutes with
$K_2$. Since $K_2 = 1 - K_1$, also $[E_{i1}, E_{i2}] = 0$. Since $L_i = E_{i1} + E_{i2}$, the operators
$E_{i1}, E_{i2}$ also commute with $L_i$. Again by definition $0 \le E_{ij} \le L_i$,
and since $E_{ij}$ lies in the closed linear subspace corresponding to $K_j$ we obtain
$0 \le E_{ij} \le K_j L_i K_j = K_j L_i$. Since $1 = \sum_{ij} E_{ij} \le \sum_{ij} K_j L_i \le 1$, we obtain that
$E_{ij} = L_i K_j$. We therefore conclude

7 $A \le B$ means that $B - A$ is a positive operator.

Proposition 2. A POVM for sequential measurements satisfies (3.10), only if
its marginals commute.

This implies in particular that the marginal $E(\Omega, t_1; U_2, t_2)$ cannot be a POVM
of type (3.6) corresponding to an unsharp measurement of x, unless $[x, H] = 0$.

The assumption that equation (3. 10) holds is valid for ideal measurements, 
like for instance the ones corresponding to measurements of observables with 
discrete spectrum. The generalisation of this result for non-ideal measurements 
is straightforward. 

We conclude that we cannot construct POVMs that provide the probabili- 
ties for multi-time measurements in quantum systems, if we require that they 
reproduce faithfully (or even approximately) the predictions of single-time quan- 
tum theory. This, however, does not imply that we cannot construct any such 
probabilities in a way compatible with the predictions of single-time quantum 
theory. POVMs provide the most general way to construct probability densities 
on a sample space as a linear map of the quantum state $\rho$. If we break linearity
(and hence assume that the resulting construction will not respect the convexity 
properties of the space of states) such an assignment may be possible. How- 
ever, probabilities defined through such a procedure cannot be obtained from a 
measure of the form (2. 10) (corresponding to a Markov process), because such 
a measure would be linear with respect to the initial density matrix. We shall 
take up this issue again in section 4.2. 

3.4 Constructing POVMs for sequential measurements 

Propositions 1 and 2 above demonstrate the degree of contextuality in sequential 
quantum measurements. They do not imply, however, that no POVMs exist that 
provide the probabilities of sequential measurements. Indeed, probabilities for 
sequential measurements have been considered extensively in the literature. We 
shall construct such POVMs in detail, in order to demonstrate that they are 
not only mathematically natural, but also physically reasonable. 

3.4.1 Ideal measurements 

We first consider the case of measuring an observable $x = \sum_i \lambda_i P_i$ with discrete
spectrum. Writing $Q_i = e^{iHt} P_i e^{-iHt}$, we construct the probabilities for the
most fine-grained two-time results

$$p(i,0;j,t) = \mathrm{Tr}\left(Q_j P_i \rho_0 P_i\right) = \langle i|\rho_0|i\rangle\, |\langle j|e^{-iHt}|i\rangle|^2. \tag{3.16}$$

Irrespective of the interpretation of the measurement process, the probabilities
(3.16) refer to the most elementary alternatives that can be unambiguously
determined in the experimental set-up corresponding to the sequential measurement
of x. Therefore, they can be employed to construct probabilities for general
sample sets $U_1, U_2$ on the spectrum $\Omega$ of x, namely

$$p(U_1,0;U_2,t) = \sum_{i \in U_1} \sum_{j \in U_2} p(i,0;j,t). \tag{3.17}$$

The total probability is normalized

$$p(\Omega,0;\Omega,t) = \sum_{ij} \mathrm{Tr}\left(Q_j P_i \rho_0 P_i\right) = 1. \tag{3.18}$$

Hence Eq. (3.17) defines a POVM for two-time measurements.

Note that as a result of the construction above, the probabilities $p(U_1,0;U_2,t)$
for general samplings do not depend on $U_1$ and $U_2$ through the corresponding
projectors $P_{U_1}$ and $P_{U_2}$. This strengthens the conclusion of section 3.2 that
there is no quantum logic interpretation for multi-time measurements. In classical
probability we use the same mathematical object (a characteristic function
for a subset of the sample space) to represent both a concrete measurement outcome
and a statement about a measurement outcome. In multi-time quantum
measurements this is no longer the case: a coarse-grained projector $P_U$ cannot
represent a proposition that the outcome of the corresponding measurement lies
within U: it can only represent a genuine physical event [23].
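
The difference between the two ways of handling a coarse sample set can be made concrete with a small sketch. It compares the sum (3.17) of fine-grained probabilities with the expression built directly from coarse-grained projectors, $\mathrm{Tr}(Q_{U_2} P_{U_1} \rho P_{U_1})$, which is assumed here to be the form of the non-additive measure referred to in the text. The qubit, the Hamiltonian $\sigma_y$, the time $\pi/4$ and all function names are choices made for the illustration.

```python
import numpy as np


def evolution(H, t):
    """Return U = e^{-iHt} for a Hermitian matrix H."""
    w, V = np.linalg.eigh(H)
    return V @ np.diag(np.exp(-1j * w * t)) @ V.conj().T


def projector(dim, indices):
    """Projector onto the span of the basis states listed in `indices`."""
    P = np.zeros((dim, dim), dtype=complex)
    for i in indices:
        P[i, i] = 1.0
    return P


def p_fine(rho, U, dim, U1, U2):
    """Eq. (3.17): sum of the fine-grained probabilities (3.16)."""
    total = 0.0
    for i in U1:
        Pi = projector(dim, [i])
        for j in U2:
            Qj = U.conj().T @ projector(dim, [j]) @ U
            total += np.trace(Qj @ Pi @ rho @ Pi).real
    return total


def p_projector(rho, U, dim, U1, U2):
    """Assumed coarse-grained form Tr(Q_{U2} P_{U1} rho P_{U1})."""
    P1 = projector(dim, U1)
    Q2 = U.conj().T @ projector(dim, U2) @ U
    return np.trace(Q2 @ P1 @ rho @ P1).real


sigma_y = np.array([[0, -1j], [1j, 0]])
U = evolution(sigma_y, np.pi / 4)
rho = np.full((2, 2), 0.5, dtype=complex)  # the state |+><+|

total = p_fine(rho, U, 2, [0, 1], [0, 1])     # normalization (3.18)
fine = p_fine(rho, U, 2, [0, 1], [0])         # 0.5 up to rounding
coarse = p_projector(rho, U, 2, [0, 1], [0])  # 0 up to rounding
print(total, fine, coarse)
```

With the first sample set equal to the whole spectrum, the projector expression ignores the first measurement entirely, while the sum (3.17) still records that it took place; the two numbers differ by the interference between the two fine-grained alternatives.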

3.4.2 Continuous spectrum 

The situation is more complex when one considers observables with continuous 
spectrum, such as position. In that case there are no fine-grained projectors and 
the choice of the elementary quantum probabilities, from which one may build 
the general probabilities for measurement outcomes cannot be made uniquely. 
We shall see that this implies that the probabilities are very strongly depen- 
dent on minor properties of the measurement device, so strongly in fact as to 
put into question whether the definition of a statistical ensemble is practically 
meaningful. 

The immediate generalisation of Eq. (3.16) for the measurement of an
operator with a continuous spectrum is

$$p(x_1,0;x_2,t) = \langle x_1|\rho_0|x_1\rangle\, |\langle x_2|e^{-iHt}|x_1\rangle|^2. \tag{3.19}$$

This, however, does not define a proper probability density, because it is not
normalized to unity

$$\int dx_1 \int dx_2\; p(x_1,0;x_2,t) = \infty. \tag{3.20}$$
This is due to the fact that there can be no measurements of infinite accuracy.
One has, therefore, to take into account the finite width of any position measurement,
say $\delta$. This quantity depends on the properties of the measuring
device, for example the type of the material that records the particle's position.

The simplest procedure (but not the most natural one) is to consider the
measurement of a self-adjoint operator $x_\delta = \sum_i x_i P^\delta_i$, where $P^\delta_i$ is a projection
operator corresponding to the interval $[x_i - \delta/2, x_i + \delta/2]$. In that case we may
immediately construct the fine-grained probabilities

$$p_\delta(i,0;j,t) = \mathrm{Tr}\left(Q^\delta_j P^\delta_i \rho_0 P^\delta_i\right), \tag{3.21}$$

from which we may construct probabilities for general sample sets $U_1$ and $U_2$:

$$p_\delta(U_1,0;U_2,t) = \sum_{i \in U_1} \sum_{j \in U_2} \mathrm{Tr}\left(Q^\delta_j P^\delta_i \rho_0 P^\delta_i\right). \tag{3.22}$$

Strictly speaking one may only consider sample sets that are unions of the
elementary sets that define our lattice. If, however, the size L of the sample sets
is much larger than $\delta$, we may approximate the summation with an integral.
This amounts to defining the continuous version of probabilities (3.21)

$$p_\delta(x_1,t_1;x_2,t_2) = \mathrm{Tr}\left(Q^\delta_{x_2} P^\delta_{x_1} \rho_0 P^\delta_{x_1}\right), \tag{3.23}$$

where we denoted $P^\delta_x = \int_{x-\delta/2}^{x+\delta/2} dy\, |y\rangle\langle y|$.

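To construct the probabilities $p_\delta(U_1,0;U_2,t)$, we split each set $U_i$ into mutually
exclusive cells $u_{\alpha i}$ of size $\delta$, such that

$$\bigcup_\alpha u_{\alpha i} = U_i, \tag{3.24}$$

$$u_{\alpha i} \cap u_{\beta i} = \emptyset, \quad \alpha \neq \beta. \tag{3.25}$$

If we select points $x_{\alpha i} \in u_{\alpha i}$, for all i ($x_{\alpha i}$ may be the midpoint of $u_{\alpha i}$),
we may construct the probability $p_\delta(U_1,0;U_2,t)$

$$p_\delta(U_1,0;U_2,t) = \sum_\alpha \sum_\beta p_\delta(x_{\alpha 1},0;x_{\beta 2},t). \tag{3.26}$$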

In the limit that the typical size of the sets $U_1, U_2$ is much larger than $\delta$, we
obtain

$$p_\delta(U_1,t_1;U_2,t_2) = \frac{1}{\delta^2} \int_{U_1} dx_1 \int_{U_2} dx_2\; p_\delta(x_1,t_1;x_2,t_2). \tag{3.27}$$

In other words, the objects $\frac{1}{\delta^2}\, p_\delta(x_1,t_1;x_2,t_2)$ play the role of probability densities.

Equation (3.27) suggests that the two-time probabilities are given by the
positive operators

$$\Pi(U_1,0;U_2,t) = \frac{1}{\delta^2} \int_{U_1} dx_1 \int_{U_2} dx_2\; P^\delta_{x_1} Q^\delta_{x_2} P^\delta_{x_1}, \tag{3.28}$$

which fail to define a POVM, because they are not normalized to unity:
they differ from 1 by a term of order $O(\delta)$. This is an artefact of the way we
implemented the continuous limit in going from probabilities (3.27) to those
of (3.28). An error of the order of $O(\delta)$ is reasonable, since the sampling error
is itself of the order of $\delta$.

It is easy to remedy this problem by working with POVMs for the single-time
probabilities. We consider a POVM $\Pi^\delta(U) = \int_U dx\, \Pi^\delta_x$ for position that
satisfies the following properties

$$\int dx\, \Pi^\delta_x = 1, \qquad \int dx\, x\, \Pi^\delta_x = \hat{x}. \tag{3.29}$$

For example, one may consider the Gaussian POVM

$$\Pi^\delta_x = \int dy\, \frac{1}{\sqrt{2\pi\delta^2}}\, e^{-(x-y)^2/2\delta^2}\, |y\rangle\langle y|. \tag{3.30}$$

Then the operators

$$R^\delta(U_1,0;U_2,t) = \int_{U_1} dx_1 \int_{U_2} dx_2\, \sqrt{\Pi^\delta_{x_1}}\; e^{iHt}\, \Pi^\delta_{x_2}\, e^{-iHt} \sqrt{\Pi^\delta_{x_1}} \tag{3.31}$$

satisfy all properties of a POVM including the normalization condition. It is
easy to check that within an error of $O(\delta)$ the probabilities defined by the
POVM (3.31) coincide with those of (3.28) (with $\sqrt{2\pi}\,\delta$ in place of $\delta$). The
generalisation to n-time measurements is straightforward

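$$\begin{aligned}
R^\delta(U_1,t_1;U_2,t_2;\ldots;U_n,t_n) = {} & \int_{U_1} dx_1 \int_{U_2} dx_2 \cdots \int_{U_n} dx_n\; e^{iHt_1} \sqrt{\Pi^\delta_{x_1}}\, e^{iH(t_2-t_1)} \sqrt{\Pi^\delta_{x_2}} \cdots \\
& \cdots e^{iH(t_n-t_{n-1})}\, \Pi^\delta_{x_n}\, e^{-iH(t_n-t_{n-1})} \cdots \sqrt{\Pi^\delta_{x_2}}\, e^{-iH(t_2-t_1)} \sqrt{\Pi^\delta_{x_1}}\, e^{-iHt_1}.
\end{aligned} \tag{3.32}$$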
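
The normalization of the two-time operators (3.31) can be checked in one line, assuming only the first property in (3.29) and the unitarity of the time evolution. The following is a sketch of that step, written for sample sets equal to the whole real line; it is not quoted from the original derivation.

```latex
\begin{aligned}
R^\delta(\mathbb{R},0;\mathbb{R},t)
 &= \int dx_1 \sqrt{\Pi^\delta_{x_1}}\; e^{iHt}
    \left( \int dx_2\, \Pi^\delta_{x_2} \right) e^{-iHt} \sqrt{\Pi^\delta_{x_1}} \\
 &= \int dx_1 \sqrt{\Pi^\delta_{x_1}}\; e^{iHt} e^{-iHt} \sqrt{\Pi^\delta_{x_1}}
  = \int dx_1\, \Pi^\delta_{x_1} = 1 .
\end{aligned}
```

The same argument, applied from the innermost (latest-time) factor outwards, gives the normalization of the n-time operators (3.32).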

3.5 Basic features of the constructed POVMs 

3.5.1 The inequivalence with stochastic processes 

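The sequence of the POVMs (3.32) defines, for all values of n, probability densities $p^\delta_n$ through

$$\int_{U_1 \times \cdots \times U_n} dx_1 \cdots dx_n\; p^\delta_n(x_1,t_1;x_2,t_2;\ldots;x_n,t_n) = \mathrm{Tr}\left[\rho\, R^\delta(U_1,t_1;U_2,t_2;\ldots;U_n,t_n)\right]. \tag{3.33}$$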

This result, in conjunction with the theorems of section 3.3, demonstrates
the inequivalence of quantum probabilities for multi-time measurements with
those that can be obtained by a classical stochastic process. The sequence (3.33)
does not define a probability measure on the space of paths, because the
compatibility condition necessary for the definition of such a measure,

$$p^\delta_{n-1}(x_1,t_1;\ldots;x_{i-1},t_{i-1};x_{i+1},t_{i+1};\ldots;x_n,t_n) = \int dx_i\; p^\delta_n(x_1,t_1;\ldots;x_i,t_i;\ldots;x_n,t_n) \tag{3.34}$$

for all possible $i = 1, 2, \ldots, n$, is not satisfied.
A weaker version is satisfied instead,

$$p^\delta_{n-1}(x_1,t_1;x_2,t_2;\ldots;x_{n-1},t_{n-1}) = \int dx_n\; p^\delta_n(x_1,t_1;x_2,t_2;\ldots;x_{n-1},t_{n-1};x_n,t_n), \quad t_n > t_{n-1} > \ldots > t_2 > t_1, \tag{3.35}$$

namely only if we integrate over the variables defined at the final moment of time
in the n-time distribution do we obtain the (n-1)-time probability distribution.
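
The asymmetry between the two marginals can be seen in a discrete-spectrum analogue. The sketch below uses the ideal two-time probabilities (3.16) for a qubit rather than the Gaussian POVM (3.32); the Hamiltonian $\sigma_y$, the time $\pi/8$, the initial state and the variable names are assumptions made for the example.

```python
import numpy as np

t = np.pi / 8
c, s = np.cos(t), np.sin(t)
U = np.array([[c, -s], [s, c]], dtype=complex)  # e^{-i sigma_y t}
rho = np.full((2, 2), 0.5, dtype=complex)       # the state |+><+|

P = [np.diag([1.0, 0.0]).astype(complex),
     np.diag([0.0, 1.0]).astype(complex)]
Q = [U.conj().T @ Pj @ U for Pj in P]           # Heisenberg-picture projectors

# Two-time probabilities p(i,0;j,t) = Tr(Q_j P_i rho P_i), Eq. (3.16).
p = np.array([[np.trace(Q[j] @ P[i] @ rho @ P[i]).real for j in range(2)]
              for i in range(2)])

# Single-time predictions of quantum theory at time 0 and at time t.
first_time = np.array([np.trace(rho @ P[i]).real for i in range(2)])
second_time = np.array([np.trace(rho @ Q[j]).real for j in range(2)])

sum_over_final = p.sum(axis=1)    # analogue of (3.35): matches first_time
sum_over_initial = p.sum(axis=0)  # analogue of (3.34) for i = 1: does not match
print(sum_over_final, first_time)
print(sum_over_initial, second_time)
```

Summing over the later outcome returns the single-time distribution at the earlier time, whereas summing over the earlier outcome gives the distribution of a dephased state, not the one obtained from unitary evolution alone.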

3.5.2 Strong dependence on the apparatus's resolution 

Since the functions (3.33) provide a well-defined system of joint probability
densities, one could consider defining a generalisation of stochastic processes
that would reproduce the predictions of quantum measurements. There is, however,
a problem: the POVM (3.32) and the corresponding probability densities
depend very strongly on the parameter $\delta$.

This is a direct consequence of the fact that the probabilities (3.33) arise
out of the non-additive measure (2.15). Suppose we consider two different
measurement devices, one characterized by a value $\delta$ and another by a value
$2\delta$. In any reasonable measurement scheme one would expect that the two-time
probabilities for sample sets $U_1$ and $U_2$ would not be appreciably different if
their size is much larger than $\delta$. However, this turns out not to be the case. It
is easier to see this in the discretized expression (3.21).

A projection operator $P^{2\delta}_x$ centered around x with width $2\delta$ can be written
as the sum $P^\delta_{x-\delta/2} + P^\delta_{x+\delta/2}$. When we construct the elementary probabilities
corresponding to $P^{2\delta}_x$, which correspond to sets of width $2\delta$, they will differ from
the probabilities for the same sets, when the latter are constructed by sets of
width $\delta$. Their difference will be the interference term

$$2\,\mathrm{Re}\, d_\delta(x_1 + \delta/2, x_1 - \delta/2, t_1; x_2, t_2) = 2\,\mathrm{Re}\,\mathrm{Tr}\left(e^{iH(t_2 - t_1)} P^{2\delta}_{x_2} e^{-iH(t_2 - t_1)} P^\delta_{x_1+\delta/2}\, \rho(t_1)\, P^\delta_{x_1-\delta/2}\right). \tag{3.36}$$

The modulus of the 'interference' term is, in general, of the same order of
magnitude as the probabilities $p_\delta$ and $p_{2\delta}$ themselves. Hence, when we sum
(or integrate) over the probabilities corresponding to the cells of width $\delta$ or $2\delta$
to construct the probabilities $p_\delta(U_1,t_1;U_2,t_2)$ and $p_{2\delta}(U_1,t_1;U_2,t_2)$ for generic
large sample sets $U_1$ and $U_2$, the results differ by an amount of

$$\epsilon_\delta(U_1,t_1;U_2,t_2) = \mathrm{Re} \int_{U_1} dx_1 \int_{U_2} dx_2\; d_\delta(x_1 + \delta/2, x_1 - \delta/2, t_1; x_2, t_2). \tag{3.37}$$

This term is of the same order as the probabilities themselves [23], a fact pointing
to the strong dependence of the results on the resolution $\delta$. Different values of
$\delta$ lead to very different probabilities.
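
The origin of this resolution dependence can be reproduced on a small lattice. In the sketch below the "position" basis has eight sites, cells of width $\delta$ are single sites and cells of width $2\delta$ are pairs of neighbouring sites. The hopping Hamiltonian, the initial wave packet, the evolution time and the function names are all assumptions made for illustration; the point is only that the two probabilities for the same sample sets differ exactly by the sum of interference terms of the type (3.36).

```python
import numpy as np


def evolution(H, t):
    """Return U = e^{-iHt} for a Hermitian matrix H."""
    w, V = np.linalg.eigh(H)
    return V @ np.diag(np.exp(-1j * w * t)) @ V.conj().T


def block_projector(dim, start, width):
    """Projector onto `width` consecutive lattice sites beginning at `start`."""
    P = np.zeros((dim, dim), dtype=complex)
    idx = np.arange(start, start + width)
    P[idx, idx] = 1.0
    return P


def two_time_probability(rho, U, cells_1, cells_2):
    """Sum of Tr(Q_b P_a rho P_a) over cells a in cells_1 and b in cells_2."""
    total = 0.0
    for Pa in cells_1:
        for Pb in cells_2:
            Qb = U.conj().T @ Pb @ U
            total += np.trace(Qb @ Pa @ rho @ Pa).real
    return total


dim = 8
H = -(np.eye(dim, k=1) + np.eye(dim, k=-1))     # assumed hopping Hamiltonian
U = evolution(H, 1.0)
n = np.arange(dim)
psi = np.exp(-(n - 3.5) ** 2 / 8.0 + 0.9j * n)  # assumed initial wave packet
psi /= np.linalg.norm(psi)
rho = np.outer(psi, psi.conj())

# Sample sets: U1 = sites 0..3 at the first time, U2 = sites 4..7 at the second.
fine_1 = [block_projector(dim, a, 1) for a in range(0, 4)]
fine_2 = [block_projector(dim, b, 1) for b in range(4, 8)]
coarse_1 = [block_projector(dim, a, 2) for a in range(0, 4, 2)]
coarse_2 = [block_projector(dim, b, 2) for b in range(4, 8, 2)]

p_delta = two_time_probability(rho, U, fine_1, fine_2)
p_2delta = two_time_probability(rho, U, coarse_1, coarse_2)

# Interference between the two halves of each coarse cell in U1.
Q_U2 = U.conj().T @ block_projector(dim, 4, 4) @ U
interference = sum(
    2 * np.trace(Q_U2 @ block_projector(dim, a, 1) @ rho
                 @ block_projector(dim, a + 1, 1)).real
    for a in range(0, 4, 2))
print(p_delta, p_2delta, interference)
```

Because the interference term involves the coherences of the state between neighbouring cells, it does not average out when the sample sets are made large compared with the cell size.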

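We may also see this in the POVM (3.31). For the special case of a free
particle, $H = \hat{p}^2/2m$, the matrix elements of (3.31) equal

$$\begin{aligned}
\langle x|R^\delta(U_1,0;U_2,t)|x'\rangle = {} & \frac{m}{2\pi t}\,\frac{1}{\sqrt{2\pi\delta^2}} \int_{U_1} dx_1 \int_{U_2} dx_2\, \exp\left[i\frac{m}{t}(x - x')\left(x_2 - \frac{x + x'}{2}\right)\right] \\
& \times \exp\left[-\frac{1}{2}\left(\frac{1}{4\delta^2} + \frac{m^2\delta^2}{t^2}\right)(x - x')^2 - \frac{1}{2\delta^2}\left(x_1 - \frac{x + x'}{2}\right)^2\right].
\end{aligned} \tag{3.38}$$

This can be written as

$$\begin{aligned}
\langle x|R^\delta(U_1,0;U_2,t)|x'\rangle = {} & \frac{m}{2\pi t}\; \widetilde{T_{\frac{x+x'}{2}}\chi_{U_2}}\left(\frac{m(x - x')}{t}\right) \chi^\delta_{U_1}\left(\frac{x + x'}{2}\right) \\
& \times \exp\left[-\frac{1}{2}\left(\frac{1}{4\delta^2} + \frac{m^2\delta^2}{t^2}\right)(x - x')^2\right],
\end{aligned} \tag{3.39}$$

where $T_x$ denotes the translation operator on $\mathbb{R}$, $\tilde{\chi}$ the Fourier transform
of the characteristic function $\chi$ and $\chi^\delta_U$ is a smeared characteristic function of
position. If the size of $U_1$ is much larger than $\delta$, then one may approximate $\chi^\delta_{U_1}$
by an exact characteristic function, thus obtaining

$$\begin{aligned}
\langle x|R^\delta(U_1,0;U_2,t)|x'\rangle \simeq {} & \frac{m}{2\pi t}\; \widetilde{T_{\frac{x+x'}{2}}\chi_{U_2}}\left(\frac{m(x - x')}{t}\right) \chi_{U_1}\left(\frac{x + x'}{2}\right) \\
& \times \exp\left[-\frac{1}{2}\left(\frac{1}{4\delta^2} + \frac{m^2\delta^2}{t^2}\right)(x - x')^2\right].
\end{aligned} \tag{3.40}$$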

The matrix elements of the POVM involve a product of two terms that depend
on the sample sets (and not on $\delta$) with a Gaussian term that is very sensitive
to $\delta$ and does not depend on the sample sets. This clearly demonstrates the
strong dependence of the POVM (3.38) on $\delta$, which persists even for very
coarse sample sets. It is easy to verify that the norm of the difference between
two positive operators $R^\delta(U_1,0;U_2,t)$ and $R^{\delta'}(U_1,0;U_2,t)$, for different values $\delta$
and $\delta'$ respectively, is of the order of the norm of the operators themselves. For
most states, therefore, the probabilities will be very sensitive to the resolution $\delta$.

Note also that as $\delta \to 0$ the POVM (3.32) does not converge to a PVM
that provides an ideal value for probabilities, as is the case in single-time measurements.
Instead, $\lim_{\delta \to 0} R^\delta(U_1,0;U_2,t) = 0$. This behavior is a consequence
of the use of the square root of the POVM $\Pi$ in (3.32). This is necessary in
order to guarantee that (3.32) is a proper POVM, normalized to unity and
with the correct dimensions to define a probability density on $\mathbb{R}^n$.

3.6 Quantum measuring device

Our previous derivation of the probabilities for multi-time measurements was 
based on an operational description of the quantum measurement process, namely 
the assumption that the measuring device is classical and that the effect of the 
measurement is the "reduction of the wave packet" rule, either corresponding to
ideal measurements, or to non-ideal ones (in which case we employ POVM's). 

One may however object that a full quantum mechanical treatment of the 
measuring device may presumably lead to a different result. We shall argue 
here that this is not the case. It is well known that the standard treatment of a 
quantum measurement device (together with von Neumann's reduction rule) is 
equivalent to the description of the probabilities for a quantum system with a 
PVM [24] . For observables with continuous spectrum measurements are usually 
unsharp and the sampling of the quantum system turns out to be equivalent 
to a POVM that depends on specific properties of the interaction between the 
quantum system and the measurement device. This dependence is, however, 
relatively weak as the probability for sufficiently coarse-grained sets is largely 
insensitive to such details [25, 26]. 

The generalisation of the results above for sequential measurements is straight- 
forward. One only needs to introduce a different measurement device for each 
measurement. If the devices are initially uncorrelated, it is easy to demonstrate 
that in ideal measurements (and discrete pointers) probabilities are provided 
by a POVM of the type (3. 17). However, if we employ a discrete pointer for 
the measurement of position the dependence of probabilities on the resolution 
arises out of the explicit correlation between the continuous variable x and the 
discrete basis for the pointer. Effectively we return to a POVM like (3.21) with
the same strong dependence on $\delta$.

A continuous pointer can be shown to lead to an equation of the form (3.31),
with the smearing function determined by the initial state of the apparatus.
To see this, we consider the following toy model. Let x be the position of
the particle we want to determine, and let the particle be prepared in a state
$|\psi_0\rangle$. We assume two identical measurement devices each in state $|\Psi_0\rangle$, initially
uncorrelated with the particle and with each other. The state of the total system
will then be initially $|\psi_0\rangle|\Psi_0\rangle|\Psi_0\rangle$. The pointer variables are $q_1$ and $q_2$ for each
device. We assume that the self-dynamics of the devices is negligible, that the
particle's Hamiltonian is H and that the interaction Hamiltonian is

$$H_{int} = f_1(t)\, \hat{x} \otimes \hat{k}_1 \otimes 1 + f_2(t)\, \hat{x} \otimes 1 \otimes \hat{k}_2, \tag{3.41}$$

where $\hat{k}_i$ are the conjugate momenta of $q_i$, and $f_i(t)$ is a function of time sharply
concentrated around $t_i$. In the limit of instantaneous measurements the state
of the system at time $t = t_2$ is

$$|\psi_{tot}(t_2)\rangle = \int dk_1 \int dk_2 \left(e^{-i\hat{x}k_2}\, e^{-iH(t_2 - t_1)}\, e^{-i\hat{x}k_1}\, e^{-iHt_1}|\psi_0\rangle\right) \otimes |k_1\rangle\langle k_1|\Psi_0\rangle \otimes |k_2\rangle\langle k_2|\Psi_0\rangle. \tag{3.42}$$

The probability distribution for the pointer variables is

$$\int dx\, |\langle x, q_1, q_2|\psi_{tot}(t_2)\rangle|^2. \tag{3.43}$$

This coincides with that given by the POVM (3.31), if we identify $\sqrt{\Pi_y} =
\int \frac{dk}{\sqrt{2\pi}}\, e^{-ik(\hat{x} - y)} \langle k|\Psi_0\rangle$. It is easy to verify that the corresponding $\Pi_y$ defines
a POVM for a single-time measurement of position. In this simple model the
resolution $\delta$ of the device is determined by the spread in momentum of the
apparatus's initial state. In a more realistic model, the resolution $\delta$ receives
contributions from the self-dynamics of the detector (and of a possible environment),
from the finite duration of the measurement interaction and from the
accuracy in the readings of $q_1$, $q_2$.

The presence of an environment does not change the essence of the arguments 
presented previously. The consideration of an enlarged Hilbert space that also 
contains the degrees of freedom of the environment does not make any difference 
to the mathematical formulation of the issue. The only possible way to cancel 
the strong dependence of the probabilities on the resolution $\delta$ is to assume
that the effect of the environment causes interference terms like (3.36) (where
now the projectors refer to the values of the pointer rather than those of the
measured particle) to become rapidly small (see footnote 8).

In general, the interference terms (3.36) can only be suppressed (for a
sufficiently generic initial state), if the environment causes the reduced density
matrix of particle+apparatus to be diagonalisable in the factorized basis $|i\rangle|a_j\rangle$,
where $|i\rangle$ are the eigenstates of the measured observable and $|a_j\rangle$ the pointer
basis in the apparatus's Hilbert space. This is in general not possible as can be
seen from a very general theorem [27] (and in a different but related context by
[25]). Indeed, if this diagonalization took place we would have a full resolution
of the so-called macroobjectification problem by environment-induced decoherence,
which is known not to be the case (see footnote 9 and the discussion in [28, 29, 30]).

Hence we conclude that the strong dependence of the multi-time proba- 
bilities on the properties of the measurement device is unavoidable, whether 
one considers the formalism of quantum theory as an operational description, 
or whether one considers any minimal generalisations that involve a quantum 
mechanical treatment of the measuring device. 

8 The environment is coupled to the measuring apparatus and not directly to the particle. If 
that were not the case the particle would exhibit fully classical behavior and its measurement 
would not be different from that of a classical probabilistic system. 

9 Whether environment-induced decoherence solves the full measurement problem in
interpretations other than the Copenhagen one (e.g. many-worlds or consistent
histories) is a different issue, unrelated to the aims of this paper. For the
purposes of the present argument, we are only interested in the mathematical
statement that the diagonalization in the basis $|i\rangle|a_j\rangle$ cannot be
implemented in any closed system (however large) that evolves unitarily.

3.7 Consequences

3.7.1 Contextuality of measurements 

The first result of our analysis of sequential measurements is the breakdown
of quantum logic. Unlike the single-time case, a proposition about a measurement
outcome is not represented by a projection operator. The probabilities
$p(U_1,t_1;U_2,t_2)$ do not depend on the sample sets through projectors and are
very different in different experimental setups. A two-time YES-NO experiment
will lead to different probabilities from those obtained by the experiment of
Fig. 3.

This result is complementary to the Kochen-Specker theorem: it is in general 
not possible to attribute specific values to sets of observables, even commuting 
ones, without specifying the context, namely the concrete experimental set-up. 
In other words, one cannot define a sample space for the possible outcomes of an 
observable, without referring to the specific measurement being implemented. 

The relevance of contextuality in sequential measurements (both factual and 
counterfactual) has been studied extensively in the literature. Albert, Aharonov
and D'Amato employed the result of Ref. [13] concerning an ensemble that is
both pre- and post-selected through measurements at times $t_i$ and $t_f$ [31]. They
showed that it is possible to retrodict the results of specific measurements that
could have been carried out at any moment of time in the time interval $[t_i, t_f]$.
It is important to remark that this retrodiction can be applied to incompatible 
measurements, i.e. ones corresponding to non-commuting observables. A similar 
result is also obtained by Kent [32] , who argues that retrodiction in a quantum 
theory that purports to describe individual systems (consistent histories) leads 
generically to contrary inferences. 

The discussion in this paper is set in a slightly different context from that
of the references above. We are only interested in providing a probability
measure for the results of sequential measurements that have actually taken
place. The intermediate measurement device is part of a specific experimental
set-up and we do not consider any counterfactual statements (about
retrodiction). In any case, the situations studied here involve probabilities
that are spread over different alternatives, so typically only trivial inferences
can be made.

The standard proof of the Kochen-Specker theorem assumes that it is possible
to assign definite values to commuting physical observables in individual
systems prior to measurement, which, while reasonable, is not a statement
amenable to empirical verification. Moreover, it involves an
interpretation-dependent assumption that it is possible to extrapolate the rules
of quantum theory from the description of statistical ensembles to that of
individual systems. In sequential measurements, however, the dependence of the
measurement outcomes on the specific experiment is direct in terms of concrete
empirical data, and makes no assumptions other than that quantum theory provides
the correct probabilities for the measurement outcomes in statistical ensembles.

Contextuality can also be inferred from Bell's theorem and its generalisations 
or from Wigner's theorem about the lack of a joint probability distribution for 
non-commuting observables. However, in the cases above one may provide al- 
ternative explanations: in the former case one may attribute the failure of Bell's 
inequalities to non-locality, while in the second one may invoke the inability to 
perform simultaneously measurements of incompatible observables. There are
no such limitations when the argument for quantum contextuality is phrased in
terms of the probabilities for sequential measurements. It is in principle possi- 
ble to measure multi-time probabilities in different experimental set-ups. If the 
results of our analysis are correct, these probabilities will differ strongly, thus 
providing irrefutable empirical evidence about the contextuality of quantum 
events. 



3.7.2 Inferences and conditional probability 

Conditioning is a very important part of classical probability; it is the
mathematical implementation of the idea that when we obtain information from an
experiment, we need to modify our description of the system (i.e. the
probability distribution) in order to account for the new information. The
prototype of conditioning is the notion of conditional probability, i.e. the
probability $p(A|B)$ that $A$ will take place when we have verified that $B$
occurred,

$$p(A|B) = \frac{p(A \cap B)}{p(B)}. \qquad (3.44)$$

It is sometimes suggested that the "wave packet reduction rule" can be inter- 
preted as a quantum version of conditional probability. Our results suggest that 
this is not the case. Such an interpretation is only possible, when 'conditioning' 
refers to the most fine-grained recordings of a physical system's properties. If 
we attempt to employ this rule to account for coarser alternatives, we inevitably 
lose information in the process and cannot obtain correct physical predictions. 

In the classical theory conditional probability can be employed to define
logical implication. Suppose that the conditional probability $p(A|B)$ for an
event $A$ given that $B$ was realized equals 1. Then, if in an experiment we
verify the property $B$, we may expect that the property $A$ will be almost
surely satisfied. In quantum theory the situation concerning implication is more
subtle. There is a strong distinction between prediction and retrodiction.

The fact that Eq. (3.34) does not hold implies that retrodiction is
problematic. A two-time measurement involves a different experimental set-up from
that of a single-time measurement. The single-time probability for a measurement
at time $t$ is different from any marginal obtained by tracing out the results
of any measurement at any time $t' < t$. It is, therefore, impossible to make any
inference (or any probabilistic statement) about what would have happened if
a measurement had taken place earlier, solely from the data obtained at time $t$.
It is necessary to provide the specifics of the intermediate measurement scheme.

On the other hand, the validity of Eq. (3.35) implies that prediction works
the same as in the classical case. Tracing out the results of a later measurement
yields the same probability distribution as if the measurement had not taken
place, hence it is in principle possible to make inferences from the data at time
$t$ about the results of measurements that take place at time $t' > t$.
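
The asymmetry between prediction and retrodiction can be checked on a toy example. The sketch below is an illustration only and is not the construction used in this paper: it assumes a qubit with trivial dynamics prepared in the state |0>, a sharp measurement in the sigma_x basis at the first time and a sharp measurement in the sigma_z basis at the second time, with two-time probabilities given by the usual reduction rule for rank-one projectors. All names in the code are hypothetical.

```python
import math

# Hypothetical toy model: qubit, H = 0, initial state |0>.
# First measurement (t1): sigma_x basis. Second measurement (t2): sigma_z basis.
S = 1 / math.sqrt(2)
Z_BASIS = {"0": (1.0, 0.0), "1": (0.0, 1.0)}
X_BASIS = {"+": (S, S), "-": (S, -S)}
PSI0 = (1.0, 0.0)

def overlap_sq(a, b):
    """|<a|b>|^2 for real two-component vectors."""
    return (a[0] * b[0] + a[1] * b[1]) ** 2

def single_time(basis, psi=PSI0):
    """Born-rule probabilities for a single sharp measurement."""
    return {k: overlap_sq(v, psi) for k, v in basis.items()}

def two_time(first, second, psi=PSI0):
    """p(a, t1; b, t2) = |<b|a>|^2 |<a|psi>|^2 for rank-one projectors."""
    return {(a, b): overlap_sq(vb, va) * overlap_sq(va, psi)
            for a, va in first.items() for b, vb in second.items()}

joint = two_time(X_BASIS, Z_BASIS)

# Tracing out the LATER outcome reproduces the single-time result at t1.
marg_first = {a: sum(joint[(a, b)] for b in Z_BASIS) for a in X_BASIS}
# Tracing out the EARLIER outcome does not reproduce the single-time result at t2.
marg_second = {b: sum(joint[(a, b)] for a in X_BASIS) for b in Z_BASIS}

print("marginal at t1:", marg_first, " single-time:", single_time(X_BASIS))
print("marginal at t2:", marg_second, " single-time:", single_time(Z_BASIS))
```

In this toy model the marginal at the later time assigns probability 1/2 to each sigma_z outcome, whereas a single-time sigma_z measurement on |0> gives the outcome "0" with certainty; the marginal at the earlier time agrees with the single-time sigma_x probabilities.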

3.7.3 Strong dependence on the measuring device's resolution 

In single-time quantum theory we know that different experiments of the same
type are expected to yield identical results, up to sampling and systematic
errors. This is guaranteed by the spectral theorem, and it is epistemologically
very desirable because the results of similar experiments can be immediately
compared. But in multi-time measurements even two experiments that are identical
in all details (preparation of the measuring device, source of particles, design
of the experiment) except the resolution of the measuring device will lead to
different probabilities and correlation functions. This is a very strong effect
and in principle observable. It is a source of doubt about whether any meaningful
information can be obtained from such experiments.

At a practical level we know that experiments yield more reliable results
when we have expended time and effort to minimise the errors, which may arise
either from sampling inaccuracies or from the finite resolution of the
measurement device. Copenhagen quantum theory assumes that any results we obtain
will make reference to the specific set-up, but even when we restrict our
expectations to that case, common sense suggests that the smaller the error, the
more reliable our experimental results will be. The event frequencies should
converge to some ideal values that would characterize, if not the measured
system in itself, at least the general design of the experiment. This expectation
is fulfilled in single-time quantum theory. The dependence of the typical POVMs
for unsharp measurements on the error (or resolution) $\delta$ is rather weak, and
for sufficiently coarse-grained samplings the corresponding positive operators
are close to true projectors. The probabilities corresponding to such samplings
will therefore be the same in all measurement devices, and they
will coincide with the probabilities obtained from YES-NO experiments. But in
multi-time measurements this is no longer the case. The POVM's dependence
on $\delta$ persists even for very coarse samplings. When we increase the resolution,
we do not obtain "better" results, we simply obtain different results.
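
The weak dependence on the resolution in the single-time case can be illustrated numerically. The sketch below rests on assumptions that are not taken from the text: the position distribution is a Gaussian of width `SIGMA`, the unsharp measurement adds independent Gaussian noise of width `delta`, and the sample set is the interval [-1, 1]. It only concerns single-time probabilities and does not reproduce the multi-time POVMs discussed in section 3.

```python
import math

def interval_probability(half_width, sigma, delta):
    """P(|y| < half_width) for y = x + noise, x ~ N(0, sigma^2), noise ~ N(0, delta^2)."""
    total = math.sqrt(sigma ** 2 + delta ** 2)
    return math.erf(half_width / (math.sqrt(2) * total))

SIGMA = 1.0        # assumed spread of the position distribution
HALF_WIDTH = 1.0   # coarse sample set U = [-1, 1], much wider than delta

sharp = interval_probability(HALF_WIDTH, SIGMA, 0.0)
for delta in (0.01, 0.05, 0.1):
    smeared = interval_probability(HALF_WIDTH, SIGMA, delta)
    print(f"delta={delta:5.2f}  p(U)={smeared:.5f}  shift={smeared - sharp:+.5f}")
```

Under these assumptions the shift in the probability of the coarse set is of second order in `delta`, which is the sense in which the single-time results of different devices remain comparable.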
4 Sequential measurement in hidden-variable theories

4.1 Bohmian mechanics 

The results of section 3.5 suggest that the probabilities for sequential measure- 
ments are very sensitive to even minute changes of the measuring procedure. 
Hence even if we accept that such probabilities can be defined from the sta- 
tistical data, a question is immediately raised. Why are multi-time quantum 
measurements so different from single-time ones? Clearly, an answer to this 
question cannot be provided within standard quantum theory, because the is- 
sue itself arises as a consequence of the theory's basic postulates. One would 
have to enlarge the domain of standard quantum theory and essentially work 
with hidden variables. 

The most important hidden variables theory, both because of its long history
and its intrinsic strength, is Bohmian mechanics [33]. In this theory the
additional variables are the particles' positions, which evolve according to the
modified Newton's equations

$$m\dot{x} = \mathrm{Im}\,\frac{\partial_x \Psi(x,t)}{\Psi(x,t)}, \qquad (4.1)$$

where the wave function $\Psi(x,t)$ is a solution of Schrödinger's equation.

The dynamical equations (4.1) are usually supplemented by the condition
of quantum equilibrium, namely that in a statistical ensemble of particles the
probability density for the particle's position is given by Born's rule:
$p(x,t) = |\Psi(x,t)|^2$.

Assuming quantum equilibrium, it is easy to construct a probability density
for the outcomes of sequential measurements. The particle's position and the
wave function satisfy a set of differential equations and are fully deterministic.
Hence any trajectory can be fully specified by the knowledge of the initial
conditions: the position $x_0$ and the wave function $\Psi_0(x)$. Let us denote as
$x(t; x_0, \Psi_0]$ the solutions to (4.1); they can be viewed as functions of the
random variable $x_0$.

The probability that the particle lies in the set $U_1$ at time $t_1$, in $U_2$ at
time $t_2$, ..., and in $U_n$ at time $t_n$ then equals

$$p_n(U_1,t_1;U_2,t_2;\ldots;U_n,t_n) = \int dx_0\, |\Psi_0(x_0)|^2\, \chi_{U_1}[x(t_1;x_0,\Psi_0]]\, \chi_{U_2}[x(t_2;x_0,\Psi_0]] \cdots \chi_{U_n}[x(t_n;x_0,\Psi_0]]. \qquad (4.2)$$

The tower of all n-time probabilities satisfies by construction the compatibility
condition and thus defines a measure $d\mu[x(\cdot)]$ on the space of paths, which
depends only on the initial wave function and the Hamiltonian operator of the
wave function's evolution. This measure fully reproduces the predictions of
standard quantum theory at a single moment of time.
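
Equation (4.2) lends itself to a simple Monte Carlo estimate. The sketch below is an illustration under assumptions that do not come from the text: units with hbar = 1, a free particle of mass `M`, and an initial Gaussian packet at rest whose position density has standard deviation `SIGMA0`. For that special case the Bohmian trajectories are known in closed form, x(t) = x0 sqrt(1 + (t / (2 M SIGMA0^2))^2), so no numerical integration of (4.1) is needed. The function names are hypothetical.

```python
import math
import random

# Assumptions (not from the text): hbar = 1, free particle, Gaussian packet at rest.
M = 1.0
SIGMA0 = 1.0

def spread(t):
    """Ratio sigma(t) / sigma(0) of the freely spreading packet."""
    return math.sqrt(1.0 + (t / (2.0 * M * SIGMA0 ** 2)) ** 2)

def trajectory(t, x0):
    """Deterministic position at time t for initial position x0."""
    return x0 * spread(t)

def n_time_probability(windows, samples):
    """Monte Carlo estimate of Eq. (4.2).

    windows: list of (t, lo, hi), meaning 'x(t) lies in [lo, hi]'.
    samples: initial positions drawn from |Psi_0|^2 (quantum equilibrium).
    """
    hits = sum(all(lo <= trajectory(t, x0) <= hi for t, lo, hi in windows)
               for x0 in samples)
    return hits / len(samples)

rng = random.Random(0)
samples = [rng.gauss(0.0, SIGMA0) for _ in range(20000)]

U1 = (1.0, -1.0, 1.0)                 # x in [-1, 1] at t = 1
U2 = (3.0, 0.0, 2.0)                  # x in [0, 2]  at t = 3
ALL = (3.0, -math.inf, math.inf)      # no restriction at t = 3

p12 = n_time_probability([U1, U2], samples)
p1 = n_time_probability([U1], samples)
p1_marginal = n_time_probability([U1, ALL], samples)
print(p12, p1, p1_marginal)
```

Because every probability is computed from the same ensemble of trajectories, removing the restriction at one of the times returns exactly the lower-order probability, which is the compatibility condition satisfied by the tower of n-time probabilities.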

There is no contradiction with our results in section 3.3. The no-go theorems
proved there refer to probabilities that can be constructed via POVMs:
the n-time probability densities are linear functionals of the initial density
matrix. Clearly, this is not the case here, as the initial state enters in a
non-trivial way in the definition of the random variables $x(t; x_0, \Psi_0]$. For
the same reason the stochastic process corresponding to (4.2) is
non-Markovian$^{10}$. It is however local-in-time, because the densities
corresponding to (4.2) factorise:

$$p_n(x_1,t_1;x_2,t_2;\ldots;x_n,t_n) = |\Psi(x_1,t_1)|^2\, \delta(x_1, g_{t_1,t_2}[\Psi_0](x_2))\, \delta(x_2, g_{t_2,t_3}[\Psi_0](x_3)) \cdots \delta(x_{n-1}, g_{t_{n-1},t_n}[\Psi_0](x_n)), \qquad (4.3)$$

where $g_{t,t'}[\Psi_0]$, $t < t'$, is the backwards-in-time evolution operator
corresponding to the equation of motion (4.1).

It would seem from the above expression that multi-time probabilities in 
Bohmian mechanics are different from those of standard quantum theory. The 
multi-time probability distributions in standard quantum theory cannot be ob- 
tained from a probability distribution. A difference in the probabilistic outcomes 
of multi-time measurements between Bohmian mechanics and quantum theory 
has been suggested before in [34, 35]. In these references the predictions of 
Bohmian mechanics were compared with correlation functions of the form (3.3),
which have no immediate operational interpretation, while here we compare
them with the probabilities for sequential measurements, which can in principle 
be determined empirically. Our analysis also shares some features with that of 
Hartle [36]. 

An immediate objection can be raised to the analysis above. Bohmian me- 
chanics refers to the properties of things in themselves and not to measurement 
outcomes. To obtain the measured probabilities one would have to model the 
interaction of the quantum system with a measuring device. The Bohmian
description of quantum measurements has been developed in [37]; see also a related
discussion about Stochastic Mechanics in [38]. In these references it is argued
that the reduction of the wave packet rule can be obtained from Bohmian me- 
chanics after the interaction of a system with a measuring device has been taken 
into account. One would therefore expect that in sequential measurements the 
predictions of quantum theory should be reproduced. 

10 If we distinguish the two roles of the wave function as probability distribution and as
agent of dynamical evolution, then the stochastic process is Markovian. It is however not
time-homogeneous, unless the wave function is an eigenstate of the Hamiltonian operator.

We next examine this issue in more detail. We consider a two-time
measurement. Let $x$ be the particle's position, $Q^1$ and $Q^2$ the variables for the
first and second measurement device respectively, and $X$ a pointer function of
$Q^1$ or $Q^2$, the range of which is the space of possible alternatives in each
measurement. The pointer function may be either continuous or discrete. The total
system will be effectively described by a stochastic process analogous to the one
defined by (4.2). Assuming quantum equilibrium for the total system, the
probability that the pointer $X$ is found in a set $U_1$ at $t_1$, and in a set $U_2$
at $t_2$, equals



where $Q^i(t; x_0, Q_0, \Psi_0]$ are the solutions to the deterministic equations of
motion for the variables $Q$ written in terms of the initial condition. The marginal
probability, in which the results of the first measurement have been traced out,
equals

(4.5)

On the other hand the probability of a single-time measurement at time $t_2$ is

$$p_1(U_2,t_2) = \int dx_0\, dQ_0^2\, |\Psi_0(x_0,Q_0^2)|^2\, \chi_{U_2}[X(Q^2(t_2;x_0,Q_0^2,\Psi_0])]. \qquad (4.6)$$

The crucial difference lies in the equations of motion for the pointer variable:
there are no $Q^1$ variables in the expression for $p_1$, because there is no measuring
device at time $t_1$. The probabilities (4.5) and (4.6) refer to different physical
systems and as such they correspond to different stochastic processes.

4.2 Contextuality from non-locality

The analysis of multi-time probabilities in Bohmian mechanics above does not
guarantee that the predictions of Bohmian mechanics coincide with those of
standard quantum theory. We shall demonstrate now that the key property
that permits that is the inherent non-locality of Bohm's theory. For a different
derivation of the constraints from local realism to the conditional probabilities
of sequential measurements the reader is referred to [39, 40]. Also related are
the constraints that can be expressed in terms of "temporal Bell inequalities"
[41, 42, 43]. The derivation we provide here refers to the most general case.

We consider a general deterministic hidden variable theory. The probabilistic
description arises from an initial probability distribution for a statistical
ensemble, which is related to the wave function by Born's rule. To model
a sequential measurement we assume the same variables $x$, $Q^1$ and $Q^2$ as in
section 4.1. The wave function $\Psi_0$ at $t = 0$ is assumed factorized, namely
$\Psi_0(x_0, Q_0^1, Q_0^2) = \psi_0(x_0)\phi_1(Q_0^1)\phi_2(Q_0^2)$. We also assume that the
degrees of freedom $Q^1$ and $Q^2$ do not interact directly, and that at time $t_1$ the
particle has not interacted with the degrees of freedom of the second measurement
device. This implies that the value of $x$ and $Q^1$ at time $t_1$ does not depend on
the initial value $Q_0^2$. The conditions above are natural in any measurement process.

We first consider a discrete pointer $X$ that takes values in the finite set
$\Omega$ of elementary alternatives. We assume that after a measurement the pointer
$X$ reveals the value of the function $f$ of the variable $x$, so that at the time $t$
when the measurement interaction has finished

$$X[Q(t; x_0, Q_0)] = f[x(t; x_0, Q_0)], \qquad (4.7)$$

where $x_0$, $Q_0$ are the initial values of the configuration variables of the system
and apparatus respectively. We simplify the notation by writing
$X[Q(t; x_0, Q_0)] = X_t(x_0, Q_0)$ and $f[x(t; x_0, Q_0)] = f_t(x_0, Q_0)$. The variables
$x$ need not only refer to particle positions, but may in principle refer to other
degrees of freedom, e.g. spin.

We next assume that the single-time predictions of this theory coincide with
those of standard quantum mechanics for ideal measurements, namely ones
corresponding to a PVM $\hat{P}_U$ on $\Omega$ defined on the Hilbert space of the
system's wave functions.

(4.8)

where $f_t^{sys}(x_0)$ refers to the evolution of the measured system in absence of
the measurement device (equivalent to the quantum mechanical evolution with
a Hamiltonian $H$). Equation (4.8) holds in Bohmian mechanics when $f$ is a
function of position, but may be valid for more general configurational variables.
The key assumption is that the operational predictions of quantum theory are
valid, namely that the probabilities for measurements can be obtained by an
application of Born's rule to the wave function of the system alone. This holds
for ideal measurements.

The crucial assumption is that after the first measurement has been completed
the system does not interact any more with the first device. The variables
$Q^1$ do not appear any more in its equation of motion. This is essentially an
assumption of locality for the interaction of the system with the measurement
device. It implies that

$$x_{t_2}(x_0, Q_0^1, Q_0^2) = x_{t_2}(x_{t_1}(x_0, Q_0^1), Q_0^2),$$
$$Q^2_{t_2}(x_0, Q_0^1, Q_0^2) = Q^2_{t_2}(x_{t_1}(x_0, Q_0^1), Q_0^2), \qquad (4.9)$$

namely the values of $x$ and $Q^2$ after the second measurement depend on $Q^1$
only through the value of $x$ immediately after the first measurement. Using the
locality condition the probability (4.4) is written as



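$$
\begin{aligned}
p_2(U_1,t_1;U_2,t_2) &= \int dx_0\, dQ_0^1\, dQ_0^2\; |\psi_0|^2(x_0)\, |\phi_1|^2(Q_0^1)\, |\phi_2|^2(Q_0^2) \\
&\quad \times \chi_{U_1}[X_{t_1}(x_0,Q_0^1)]\; \chi_{U_2}[X_{t_2}(x_{t_1}(x_0,Q_0^1),Q_0^2)] \\
&= \int dx_0\, dQ_0^1\; |\psi_0|^2(x_0)\, |\phi_1|^2(Q_0^1)\; \chi_{U_1}[X_{t_1}(x_0,Q_0^1)]\; \chi_{U_2}[x^{sys}(f_{t_1}(x_0,Q_0^1))] \\
&= \int dx_0\, dQ_0^1\; |\psi_0|^2(x_0)\, |\phi_1|^2(Q_0^1)\; \chi_{U_1}[X_{t_1}(x_0,Q_0^1)]\; \chi_{(x^{sys})^{-1}U_2}[X_{t_1}(x_0,Q_0^1)] \\
&= \int dx_0\, dQ_0^1\; |\psi_0|^2(x_0)\, |\phi_1|^2(Q_0^1)\; \chi_{U_1 \cap (x^{sys})^{-1}U_2}[X_{t_1}(x_0,Q_0^1)] \\
&= \int dx_0\, dQ_0^1\; |\psi_0|^2(x_0)\, |\phi_1|^2(Q_0^1)\; \chi_{U_1 \cap (x^{sys})^{-1}U_2}[f_{t_1}(x_0,Q_0^1)] \\
&= \int dx_0\; |\psi_0|^2(x_0)\; \chi_{U_1 \cap (x^{sys})^{-1}U_2}[f^{sys}_{t_1}(x_0)] \\
&= \int dx_0\; |\psi_0|^2(x_0)\; \chi_{U_1}[f^{sys}_{t_1}(x_0)]\; \chi_{(x^{sys})^{-1}U_2}[f^{sys}_{t_1}(x_0)] \\
&= \int dx_0\; |\psi_0|^2(x_0)\; \chi_{U_1}[f^{sys}_{t_1}(x_0)]\; \chi_{U_2}[f^{sys}_{t_2}(x_0)]. \qquad (4.10)
\end{aligned}
$$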

In the above derivation we employed Eq. (4.8) in going from the second to the
third line; in going from the third to the fourth line we used Eq. (4.7) and
denoted by $(x^{sys})^{-1}$ the inverse of the deterministic law of motion that takes
$x_{t_1}$ to $x_{t_2}$ in absence of apparatus; in going from the fourth to the fifth
line we used the fact that $\chi_{U_1}\chi_{U_2} = \chi_{U_1 \cap U_2}$. In going from the
fifth to the sixth line we used Eq. (4.7). Note that $f_t^{sys}(x_0)$ stands for
$f(x_t^{sys}(x_0))$.

We proved therefore that the two-time probabilities coincide with those
obtained for a stochastic process constructed solely from the degrees of freedom of
the measured system. Hence a hidden variables theory that satisfies (4.9) and
reproduces the single-time predictions of standard quantum theory exhibits no
contextuality in sequential measurements. Obviously $p_2$ satisfies the
compatibility condition (3.34) and thus differs from the corresponding predictions
of standard quantum theory.

The result above does not apply to Bohmian mechanics, because the latter
does not satisfy the locality condition (4.9). As the wave function evolves, it
becomes entangled in the variables $x$ and $Q^1$ after the first measurement; see
the discussions in [44]. As a result of Eq. (4.1), the equation of motion for $x$
after the first measurement will explicitly involve $Q^1$ and hence Eq. (4.9)
will not be satisfied. Due to entanglement the measured system continues to be
affected by the degrees of freedom of the first measurement device even if it is
far away from it. We see therefore that what appears as strong contextuality of
the empirical probabilities in standard quantum theory arises in Bohmian mechanics
as a consequence of the role of the wave function as a carrier of non-locality
in entangled systems.

In Appendix B we provide a generalisation of this result for unsharp
measurements, and also for hidden variable theories that can be modeled by 
a Markov process. It is important to remark that in deriving these results we 
need make no assumptions about the explicit form of the POVM for sequential 
measurements: any POVM that satisfies the assumption of Proposition 2 in 
Section 3.3.2 is adequate for this purpose. 

4.3 Other hidden variable models 

Is it possible to write hidden variable theories that reproduce the predictions of 
quantum theory for sequential measurements, without violating some form of 
the locality assumption? This is the same question that may be asked about 
hidden variable theories that violate the Bell inequalities. If such theories are 
to account for the single-time predictions of quantum theory one needs to intro- 
duce a probability density for these variables (either fundamental or emergent) 
and the usual calculus of probabilities almost guarantees that sequential mea- 
surements will be described by a stochastic process. 

The only conceivable hidden variable theories compatible with standard 
quantum theory in sequential measurements would be ones that introduce ad- 
ditional variables, other than the ones necessary to obtain agreement with the 
predictions of quantum theory. They would correspond to degrees of freedom 
fundamentally different from those of classical mechanics (e.g. 't Hooft's
deterministic quantum theory [45]). One may then assume that these variables
are highly uncontrollable (or exhibit a kind of "coherence" within the elements
of an ensemble) so that their statistical effect cannot be modeled by any
probability density. Hence it would not be possible to write a stochastic process
for the multi-time probabilities of the theory, and consequently the arguments 
of section 4.2 would not follow. At the moment this is just a conjecture, for 
no such model has been explicitly constructed. Indeed, how could one effect 
the statistical descriptions of systems that are not described by probabilities? 
However, the existence of this possibility suggests that one might avoid both 
contextuality and non-locality by relaxing the condition that the full system is 
described by a probability theory that satisfies the Kolmogorov axioms. We 
shall consider this issue in the next section. 

5 Beyond probabilities 

5.1 Motivation 

In single-time measurements probabilities are defined naturally in terms of the 
projective geometry of the Hilbert space, through the spectral theorem. This is 
not the case for sequential quantum measurements. A choice of basis is necessary
to take into account the effect of the intermediate measurements. This results
in a probability assignment that is not natural with respect to the Hilbert space 
geometry (i.e. it does not preserve quantum logic). Contextuality of probabili- 
ties follows. If one attempts to explain it away by hidden variable models, one 
needs to introduce non-local features in the interaction of the measured system 
with the apparatus. 

In standard quantum theory contextuality is generic, as witnessed by the
Kochen-Specker theorem. In sequential measurements, however, the dependence
of probabilities on even minor details of the measurement scheme appears
rather too extreme. It is a natural question then whether one can introduce
an interpretative scheme that avoids it, without assuming at the same time
non-locality at the fundamental level. The only way to do this would be if
the multi-time probabilities respected quantum logic, or in other words if they
could be expressed in a way that respects the projective geometry of the Hilbert
space.

The consistent histories approach preserves quantum logic not at the level 
of measurement outcomes but at the level of individual quantum systems. The 
non-additivity of the measure (2.15) is sidestepped by assuming that
probabilities can only be defined for specific sets of histories (consistent sets),
in which (2.15) is reduced to an additive probability measure. Still, consistent
histories
does not avoid contextuality, whenever one attempts to make logical inferences 
based on the probabilities obtained by different consistent sets. This is natural 
from a mathematical point of view, because any probability assignment depends 
not only on the projectors representing the relevant properties of the system, 
but also on the consistent set. 

The only conceivable way to obtain non-contextual predictions for quantum
theory would be if the non-additive measure (2.15) could be employed for any
sampling of measurement outcomes described by the corresponding projectors.
Indeed, a key assumption in the derivation of the Bell and Kochen-Specker theorems
is that a probability distribution satisfying the Kolmogorov axioms can be
employed to model probabilities in a statistical ensemble. Hence a quantum theory
based on a non-additive measure for multi-time measurements may potentially
avoid the consequences of those theorems; see the discussion in references [4, 5].
However, empirical probabilities (that refer to the same measurement set-up)
are always additive as they correspond to relative frequencies. The only way
to relax the Kolmogorov probability conditions is to assume that frequencies
do not always define probabilities, i.e. that they generically do not converge.
The non-additivity of the probability measure (2.15) would then provide an
estimate of this lack of convergence.

We shall explore this rather unconventional alternative in this section. It may
seem rather contrary to the standard use of probabilities in quantum theory, but
we believe it is worth studying, because it is the only conceivable alternative
to the strong contextuality of standard quantum theory and non-locality. In
any case its predictions are in principle distinguishable from those of standard
quantum theory.

Before proceeding in the further examination of this hypothesis of non- 
converging frequencies, we shall first provide a different motivation for its intro- 
duction. We shall argue that it may be a natural description of the statistical 
data of sequential measurements, solely from an operational point of view. 

5.2 Sequential measurements: stability of the sample space 

While classical probability has been remarkably successful in modeling various 
physical systems, its applicability to a specific situation is not a priori guaran- 
teed. One needs to provide physical arguments why probability theory can model 
the outcomes of a specific experiment. These arguments involve an explanation 
of the choice of the variables that define the sample space and a justification 
why the different runs of the experiment define a proper statistical ensemble. 

The relation of probabilities to event frequencies suggests that the experi- 
ment can be repeated a large number of times with the same preparation, or 
at least in such a way that variations in the preparation procedure affect little 
the results of the experiment. If this condition cannot be satisfied, then we 
cannot talk about a statistical ensemble and have no reason to expect that the 
measured frequencies would in any way allow us to determine meaningful
empirical probabilities. Small variations in the preparation and execution of the
experiment are not so much a problem: there is always a sampling error and
intrinsic uncertainty in the determination of any experiment and we shall see
in the next section how this can be taken into account by the consideration of 
unsharp measurements. If, however, the uncertainties in an experiment are too 
large, there is little hope of extracting meaningful probabilistic information from 
it. In other words, the use of probability theory in modeling a physical system 
requires a condition of stability of the sample space, i.e. the assumption that 
the relation of the mathematical space of physical alternatives to the experi- 
mental outcomes remains the same in all elements of the statistical ensemble. 
In classical physics at least, the above condition is a necessary requirement for
any meaningful experiment; if it is not satisfied then one usually asserts that
the corresponding experiment is ill-designed.

A key feature of the probabilities for sequential measurements we derived in
section 3 is the strong dependence on the parameter $\delta$ that quantifies the
fuzziness of single-time measurements. In classical probability the fuzziness
includes contributions of very different physical origins: sampling and systematic
error, specific features of the measurement device and the effect of
uncontrollable parameters, whose effect cannot be reproduced identically in all
measurements of the statistical ensemble. It is usually unnecessary to distinguish
between the different contributions: $\delta$ may be taken as an upper limit of all
possible sources of error, as it does not affect the probabilities of sufficiently
coarse samplings. In sequential measurements this is no longer the case. If
$\delta$ is considered as a measure of the uncontrollable parameters in the system,
the sensitivity of the probabilities to its value implies that the probability
densities relevant to different runs of the experiment will differ substantially
from each other$^{11}$. It is difficult to see how a statistical ensemble of
reproducible experiments is meaningful, if their outcomes depend so strongly on
the values of uncontrollable parameters. This suggests strongly that the sample
space for sequential measurements is not stable. One may therefore question
whether empirical probabilities can be constructed in that case. It is quite
likely, on operational grounds alone, that the frequencies obtained do not exhibit
the needed convergence properties to define genuine probabilities.

5.3 Non-convergent frequencies 
5.3.1 The case of classical probability 

While the relation of probabilities to event frequencies is the basic principle in 
any statistical manipulation of data, its application is not straightforward. All 
statistical samples are obtained from a finite number of experimental runs, while 
the probabilities are defined from event frequencies in the limit that the number 
of runs goes to infinity. This has been traditionally a very strong argument 
against the definition of probabilities through frequencies. However, for practical 
purposes it suffices that we consider a sufficiently large ensemble so that the 
frequencies seem to stabilize. The central limit theorem guarantees that if the 
description of a system by classical probability theory is valid, then the error in 
the determination of probabilities after n runs will fall like $n^{-1/2}$ for sufficiently
large n. 
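
As a purely illustrative aside, the $n^{-1/2}$ behavior can be seen in a short
simulation. The sketch below is not part of the original analysis: the event
probability `p = 0.3`, the seed and the helper name `relative_frequencies` are
assumptions chosen for illustration. It draws independent runs of a classical
event and compares the error of the relative frequency with the scale
$\sqrt{p(1-p)/n}$ expected from the central limit theorem.

```python
import random


def relative_frequencies(p, n_runs, seed=0):
    """Running relative frequency of an event that occurs with probability p."""
    rng = random.Random(seed)  # fixed seed so that the sketch is reproducible
    hits = 0
    freqs = []
    for n in range(1, n_runs + 1):
        if rng.random() < p:
            hits += 1
        freqs.append(hits / n)
    return freqs


p = 0.3  # assumed probability of the event
freqs = relative_frequencies(p, 100_000)
for n in (100, 10_000, 100_000):
    error = abs(freqs[n - 1] - p)
    scale = (p * (1 - p) / n) ** 0.5  # one standard deviation of the frequency
    print(n, error, scale)
```

In a classical model of this kind the printed error stays within a few
multiples of the printed scale, which is the stabilization referred to above.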

More relevant to the present discussion is the behavior of relative frequencies
in unsharp measurements, namely when there is an error of $\delta$ (sampling
error or effect of uncontrollable parameters) in the specification of
probabilities. In that case the sequence $\nu_n(U)$ of event frequencies cannot
be expected to stabilize to a probability even after a large number of runs, if
the coarse-graining scale L of U is of the order of magnitude of $\delta$:
sampling is simply unreliable at this scale. There will be a region of
convergence: no matter how many experimental runs we take into consideration,
the sequence will take values in a region of finite size.

We may define a quantitative measure for the failure of a sequence to converge
to a specific value. If a sequence $\nu_n$ does not converge, then for
$n, m > N$, where N may be a large integer, we cannot find an arbitrarily small
number $\epsilon$ such that $|\nu_n - \nu_m| < \epsilon$. This suggests defining
the degree of non-convergence of $\nu_n$ as the limit

$$\epsilon[\nu_n] = \limsup_{n,m \to \infty} |\nu_n - \nu_m|. \qquad (5.1)$$

If $\nu_n$ is a sequence of relative frequencies it is easy to verify that
$\epsilon[\nu_n] \leq 1$ and that for a converging sequence
$\epsilon[\nu_n] = 0$. Since in practice a sequence never converges, we need to
establish a rather more heuristic criterion: we say that $\nu_n$ converges to a
probability p if the parameter $\epsilon$ is much smaller than the value of p.
In that case it defines the size of the error (or ambiguity) in the
determination of p.

11 If $\delta$ is an upper limit for the effect of uncontrollable parameters,
different sub-ensembles may be characterized by different values of $\delta$ and
hence different probability assignments.
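
For a finite record of frequencies, the degree of non-convergence (5.1) can
only be estimated on a tail of the sequence. The following sketch is an
illustration under that assumption: the function name
`degree_of_non_convergence` and the two test sequences are hypothetical, and
the supremum of $|\nu_n - \nu_m|$ over the tail is computed as the difference
between the largest and the smallest value.

```python
def degree_of_non_convergence(freqs, tail_start):
    """Finite-sample estimate of eps[nu_n]: spread of the tail of the sequence."""
    tail = freqs[tail_start:]
    # sup over n, m in the tail of |nu_n - nu_m| equals max(tail) - min(tail)
    return max(tail) - min(tail)


# A sequence that converges to 0.5 ...
converging = [0.5 + 0.5 / n for n in range(1, 10_001)]
# ... and one that keeps oscillating between 0.4 and 0.6.
oscillating = [0.5 + 0.1 * (-1) ** n for n in range(1, 10_001)]

eps_conv = degree_of_non_convergence(converging, 5_000)
eps_osc = degree_of_non_convergence(oscillating, 5_000)
print(eps_conv, eps_osc)
```

In the heuristic criterion above, the first sequence would be said to converge
to p = 0.5 because its estimated $\epsilon$ is much smaller than p, while for
the second the spread remains a sizable fraction of p however long the record.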

We next examine how the rate of convergence for a sequence of events is related
to the fuzziness of a measurement scheme. Let us denote by $\bar{\nu}_n$ the
sequence of relative frequencies constructed from the experimental data (hence
being inaccurate due to sampling errors), and by $\nu_n$ the ideal frequency
that converges to some probability p. We see that

$$|\bar{\nu}_n - p| \leq |\bar{\nu}_n - \nu_n| + |\nu_n - p|. \qquad (5.2)$$

The second term falls like $n^{-1/2}$ for large n, since it is assumed to
converge ideally. The first term on the right-hand side converges for large n to

$$\int dx\, p(x)\, |\chi^\delta_U(x) - \chi_U(x)|, \qquad (5.3)$$

for some smeared characteristic function $\chi^\delta_U$ for U that takes into
account the effect of sampling errors. We essentially assume that
$\bar{\nu}_n = \frac{1}{n} \sum_{i=1}^{n} \chi^\delta_U(x_i)$, hence the
ambiguity in the sampling of $x_i$ is transferred to the smeared characteristic
function. Hence, we have

$$|\bar{\nu}_n - p| < c\, \frac{\delta}{L}, \qquad n \to \infty, \qquad (5.4)$$

or for probability distributions with spread much larger than $\delta$ we have
the more stringent estimation (see Appendix A)

$$|\bar{\nu}_n - p| < c\, \frac{\delta}{L}\, p. \qquad (5.5)$$

5.4 Quantitative estimation 

In this section we explore the theoretical possibility that the lack of a non- 
contextual probability measure for multi-time histories is indicative of a failure 
of the event frequencies to stabilize after a large number of runs. To elaborate 
on this proposal we first need to guarantee that this assumption is compatible 
with the single-time predictions of quantum theory. This follows trivially from
the fact that (2.15) is additive for single-time measurements. The same holds
for multi-time measurements for which the projectors $P_{U_i}$ commute with the
Hamiltonian.

We would expect the lack of convergence to appear in all n-time measurements
for which (2.15) is genuinely non-additive. The lack of additivity is quantified
by the object (2.17), namely the decoherence functional in the consistent
histories approach. The decoherence functional should be a measure of the degree
of non-convergence for histories. There is another argument that lends
plausibility to this expectation. The absolute values of the off-diagonal
elements of the decoherence functional often become very small when the
selected histories are sufficiently coarse-grained. It then becomes a good
approximation to assign probabilities to such histories. This situation can be
compared with the behavior of relative frequencies under coarse-graining. If our
sampling is of the order of the measurement error $\delta$, frequencies do not
stabilize to probabilities. If we coarse-grain sufficiently, however, so that
the sample set is of size $L \gg \delta$, the relative error falls like
$\delta/L$ and reasonable probabilities can be approximately defined.

The analogy above is only mathematical, as the postulated lack of convergence
in quantum theory cannot be explained away as measurement error. It strongly
argues, however, that the decoherence functional should encode the information
of the frequency's non-convergence. In effect, one may assign a probability for
a specific sampling $U_1, U_2, \ldots, U_n$ if the decoherence functional
between the corresponding history $\alpha$ and its negation $\neg\alpha$
(corresponding to the subset $\Omega^n - U_1 \times U_2 \times \ldots \times U_n$)
is much smaller in magnitude than the probability associated to
$U_1, U_2, \ldots, U_n$, or in other words if

$$2\, \mathrm{Re}\, d(\alpha, \neg\alpha) \ll d(\alpha, \alpha). \qquad (5.6)$$

This suggests that the proper measure for the relative rate of convergence
$\epsilon[\nu_n]/p$ should be identified with the ratio
$\mathrm{Re}\, d(\alpha, \neg\alpha) / d(\alpha, \alpha)$.

We need to comment at this point on the difference of the present proposal 
from the consistent histories approach. The first difference lies in the context. 
The consistent histories approach describes individual systems, without mak- 
ing special reference to measurement outcomes. Here we are only interested in 
empirical probabilities, as determined by measurements. The second difference 
is more important: the consistent histories approach places no interpretation on
the decoherence functional or the objects (2.15), unless the former vanishes,
in which case (2.15) defines a genuine probability measure. Here, in our search
to preserve the quantum logic at the level of measurements, we need to find an
interpretation of the mathematical object (2.17) in terms of observable
quantities, namely the statistical behavior of the sequence of relative
frequencies $\nu_n$. Hence, even if there are many structural similarities
between the present hypothesis and consistent histories, both the context and
the physical interpretation of the mathematical objects are conceptually
distinct.

5.5 Experimental distinction 

We mentioned two motivations for the hypothesis of non-converging frequencies 
(a third more tentative one arising from the study of frequency operators [46, 47] 
can be found in Appendix C). The main one was the preservation of the quantum 
logic structure of sequential measurements. The decoherence functional for a
history depends on the sample sets only through the projectors. Hence the
statistical behavior of the relative frequencies that it incorporates would be 
the same (modulo sampling and systematic errors) in all different experiments 
that measure multi-time probabilities. It would not depend, in particular, on 
whether the experimental set-up is that of a sequential YES-NO experiment or 
of a sequential measurement of position. 

If this is true then it is very easy to distinguish the predictions of this
hypothesis from those of standard quantum theory. If the quantum logic structure
is preserved, probabilities for multi-time samplings are not definable, but
they are for single-time measurements. In particular, the single-time marginals
in a multi-time measurement should always coincide with those of single-time
quantum theory. In a two-time measurement of an observable
$A = \sum_i a_i P_i$, the frequencies $\nu_n(U_1, 0; U_2, t)$ do not converge
for generic sample sets $U_1$ and $U_2$, but the coarse-grained frequencies
$\nu_n(\Omega, 0; U_2, t)$ should correspond to the single-time probabilities
$\sum_{i \in U_2} \mathrm{Tr}(\rho Q_i)$, where
$Q_i = e^{i\hat{H}t} P_i e^{-i\hat{H}t}$. On the other hand standard quantum
theory predicts for the same probabilities that

$$p(\Omega, 0; U_2, t) = \sum_{j \in U_2} \sum_i \mathrm{Tr}(\rho P_i Q_j P_i). \qquad (5.7)$$

The results are clearly different. For successive Stern-Gerlach measurements,
the first in the x and the second in the z direction of spin, the former
hypothesis yields a probability density $p_i = \mathrm{Tr}(\rho P^z_i)$, with
$P^z_i$ the spectral projectors of spin, while standard quantum theory yields a
constant probability density $p_i = 1/2$. Since the distinction arises at the
level of the marginals, it is not necessary to perform experiments with
individual, distinguishable runs, but it suffices to employ particle beams.
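
The Stern-Gerlach comparison can be checked with a few lines of arithmetic. The
sketch below is only an illustration under stated assumptions: a spin-1/2
particle, an initial state polarized along +z, and trivial evolution between
the two measurements (so that $Q_j = P^z_j$). The helper names `matmul` and
`trace` are ours. It evaluates the single-time probabilities
$\mathrm{Tr}(\rho P^z_j)$ and the marginal of the sequential formula (5.7),
$\sum_i \mathrm{Tr}(\rho P^x_i P^z_j P^x_i)$.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace(a):
    return a[0][0] + a[1][1]


# Spectral projectors of spin-1/2 along z and along x.
Pz = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]
Px = [[[0.5, 0.5], [0.5, 0.5]], [[0.5, -0.5], [-0.5, 0.5]]]

# Assumed initial state: spin up along z.
rho = [[1, 0], [0, 0]]

# Single-time probabilities Tr(rho P^z_j), no first measurement.
single = [trace(matmul(rho, Pz[j])) for j in range(2)]

# Marginal of eq. (5.7): sum over the outcomes i of the x measurement.
sequential = [
    sum(trace(matmul(matmul(matmul(rho, Px[i]), Pz[j]), Px[i]))
        for i in range(2))
    for j in range(2)
]

print(single)      # probabilities without the intermediate x measurement
print(sequential)  # marginal probabilities after the x measurement
```

For this choice of state the single-time values are 1 and 0, whereas the
sequential marginal is 1/2 for both outcomes, which is the difference between
the two predictions discussed above.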

Note that the local hidden variable theories of the type considered in section
4.3 also satisfy equation (5.7), since their measurement outcomes are described
effectively by a stochastic process by virtue of equations (4.10, B.8).

Welcher-Weg experiments. One may contend that the behavior above can be
excluded on the basis of the so-called "welcher Weg" experiments, in which a
detector placed immediately behind the holes of a two-slit experiment destroys
the interference pattern. One may consider for example the treatment of [50],
which involves the interference of two neutron beams. One places a micromaser
cavity in the course of each beam. The photons in the cavity interact with the
neutrons' spin degrees of freedom. If the field in the cavities is prepared in
a number state, the interaction reveals unambiguously that the neutron passed
through one cavity or the other, and hence provides information about the
neutron's position. As a result the interference pattern observed on a screen
behind the detectors is destroyed. Clearly, the intermediate measurement leads
to different results for the probability distribution on the screen and to a
violation of (5.7).

There is however a flaw in this argument. Within standard quantum theory,
any measurement involves the separation between a quantum system and a 
classical apparatus. This is not an easy distinction to make, but it is necessary 
in order to specify the level at which the reduction procedure is implemented. 
In the experiment discussed above the electromagnetic field in the cavity is 
described by quantum mechanics. In the assumed splitting between quantum 
and classical, it falls within the quantum domain. Hence, the quantum system 
in consideration is not the neutron, but the combined system of neutron and 
electromagnetic field, which interact non-trivially through the spin degrees of 
freedom. However, in the absence of the micromaser the quantum system is only the
neutron. It is therefore not possible to verify a violation of (5.7) by comparing
the probabilities in these different experiments. Indeed, if the total system of 
electromagnetic field and neutron is treated as quantum mechanical, the loss of 
interference is expected whether or not the photon number has been measured in 
the microcavity. The distribution of particle positions is obtained by the reduced 
density matrix of the neutron interacting with the electromagnetic field. This 
is naturally expected to exhibit a loss of coherence, arising solely from unitary 
evolution of the total system. To test equation (5.7) one would have to compare
the probability distribution of neutron positions between an experiment that 
includes a measurement of the photons in the cavity and one that does not. 

In any attempt to verify the validity of (5.7) one has to compare experimental
set-ups for which the split between quantum and classical occurs at the same
level. This is the case of the two-time position measurement sketched in Fig. 3.
In that case one also has to take into account all possible sources of error. In
the original Bohr-Einstein debate that led to the formulation of the
complementarity principle, the demonstration that "which-way" information
destroys the interference pattern came essentially from classical arguments
about the inherent limitation in the precision of the first measurement [51].
Quantum theory was only introduced in order to place an upper limit on the
measuring accuracy. Bohr's argument is therefore very different in character
from that of [50]. It states essentially that the uncertainty at the level of
the classical device affects the quantum phases randomly and thus leads to a
destruction of the interference pattern.

The same argument can be invoked in the present context in relation to the 
assumption that measurements take place at a sharply specified moment of time. 
This is not the case in a realistic experiment. One may of course incorporate this 
uncertainty in the error $\delta$ of an unsharp measurement. However, in multi-time
measurements there appears an additional source of randomness: the presence 
of the first measuring device makes it more difficult to specify the moment of 
time, at which the second measurement takes place. 

To see this, we may consider for example the thought-experiment of Fig. 3.
The detection of a particle can be assumed to take place at a specific moment
of time (the same for all runs of the experiment) if the particle's momentum is
sharply defined: $p_z$ is essentially treated as a classical variable. In a
two-time measurement, however, the value of $p_z$ changes randomly after the
interaction with the first measurement device, say by an amount of
$\delta p_z = d^{-1}$, for some parameter d with the units of length. This
implies a fuzziness in the time that the particle arrives at the second
detector of the order of [...], hence an additional uncertainty in the
specification of the particle's position of an order of [...]. The uncertainty d
can be expected to be of the order of $\delta$, hence the uncertainty is of the
order of magnitude of [...].

To see that the randomness induced by the uncertainty in the specification of
the measurement time is by itself responsible for the loss of the interference
pattern, we consider the marginal probability density induced by the POVM
(3.39). For an initial state corresponding to the set-up of a two-slit
experiment

$$\psi_0(x) = \frac{1}{\sqrt{2}\,(2\pi\sigma^2)^{1/4}} \left[ e^{-\frac{(x - L/2)^2}{4\sigma^2}} + e^{-\frac{(x + L/2)^2}{4\sigma^2}} \right], \qquad (5.8)$$

where $\sigma$ is the width of the slit, L the distance between the slits and
the mean momentum in the x-direction is for simplicity set equal to zero (see
footnote 12), we obtain

$$p(\Omega, 0; x, t) = \ldots \left[ e^{-\frac{\ldots}{4(\gamma + \beta)}} + e^{-\frac{\ldots}{4(\gamma + \beta)}} + 2\, e^{\ldots} \cos\left( \frac{\ldots}{\ldots (\beta + \gamma)} \right) \right], \qquad (5.9)$$

where $\beta = \ldots$ and $\gamma = \ldots$. Hence, even the probability
constructed from the consideration of a two-time measurement in standard
quantum mechanics exhibits terms that describe a distinguishable interference
pattern. The interference pattern is washed out only when an additional
consideration of the error due to the time uncertainty is taken into account.
The period of the interference pattern is c [...], where c is a constant of
order unity. The fuzziness due to the uncertainty relation is of the order of
[...]. For the two-slit experiment to make sense the distance between the slits
has to be much larger than the resolution of the measurement device, hence
$L \gg \delta$. It follows that the interference pattern is hidden beneath the
effect of the time uncertainty.

It is important to stress the procedure we followed to obtain the result above.
Following Bohr's argumentation, the proof that the interference pattern
disappears does not arise from the operational rules of quantum theory about
sequential measurements, but from physical considerations that apply to our
inherent inability to establish with arbitrary precision the time a measurement
takes place. This effect cannot be obtained through formal manipulations in
standard quantum theory: at an operational level the formalism allows one to
talk only about measurements at sharp moments of time (see footnote 13), while
if we attempt to treat the measuring apparatus as fully quantum mechanical we
come face to face with the measurement problem (is the reduction of the wave
packet instantaneous, if a physical process at all?).

12 Note that the x direction is transverse to the particle's direction of motion.
13 This problem is related to the issue of constructing time-of-arrival
probabilities in quantum theory, and is explored further in [52].

The argument that proves that interferences are washed out in a two-time
position measurement works the same way in standard quantum theory and
within the non-converging frequency hypothesis. In the latter case one also
obtains an interference pattern with period of the order [...], which is hidden
beneath the effects of the randomness due to time uncertainty. Even though
the right-hand side is different from (5.9), the form of the terms is the same
and the difference only lies in the exact value of the coefficients. Since this
difference is drowned by the effects of the time uncertainty, we conclude that
it is very difficult (if not impossible) to distinguish the predictions of
quantum theory from those of the non-converging frequency hypothesis (or of
local hidden variable theories for that matter) by means of equation (5.7).

Hence the only way to distinguish between those theories would be by directly
measuring the frequencies $\nu_n(U_1, t_1; U_2, t_2)$ and trying to establish
whether they converge or not. In Appendix D we prove that this is in principle
feasible, i.e. the suggested failure of the frequencies to converge is much
stronger than any sources of error (such as the time uncertainty considered
earlier), and for this reason it is in principle detectable.

6 Conclusions 

We conclude with a brief summary of the paper's results. 

We studied the issue of constructing probabilities for sequential measurements.
In Section 3 we demonstrated that these probabilities are highly contextual,
namely they depend very strongly on seemingly trivial details of the apparatus
(the parameters that determine its resolution). We noted that this is a case of
contextuality that does not involve counterfactual reasoning: it may be
determined in a direct measurement set-up.

A key step for our analysis is the proof of a general theorem that there is 
no way to reproduce the probability distribution for the results of quantum 
theory from a stochastic process for the measured system's degrees of freedom. 
We elaborated on this point in section 4, where we demonstrated that hidden 
variable theories can reproduce the predictions of standard quantum theory only 
if they include non-local interactions. 

Finally, in section 5 we explored a rather unconventional alternative that
could allow the preservation of quantum logic in sequential measurements: that
probabilities are not defined, because the corresponding frequencies do not
converge. We demonstrated that the predictions of this proposal can be
unambiguously distinguished from those of quantum theory.

A Unsharp measurements and smeared characteristic functions

In any measurement there are systematic errors, uncertainties in the specifi- 
cation of the initial state or the preparation of the apparatus, fuzziness in the 
sampling of the results, dependence on uncontrollable properties of the mea- 
surement device etc. For this purpose it is necessary to consider the description 
of unsharp or fuzzy measurements. 

In unsharp measurements there will be outcomes for which we will not be able to
state unambiguously that "the system was found in the subset U of the sample
space $\Omega$" or its negation. Such assertions can be made only with a degree
of confidence characterized by the relative size $\delta$ of the measuring
uncertainty. This can be implemented by substituting the characteristic
functions that represent the propositions about the measurement outcomes with
smeared characteristic functions $\chi^\delta_U$, which differ from true
characteristic functions on the scale of $\delta$. Given the fact that a
characteristic function for $\Omega = \mathbb{R}$ is written as

$$\chi_U(x) = \int_U dx'\, \delta(x - x'), \qquad (A.1)$$

a smeared characteristic function may be written as

$$\chi^\delta_U(x) = \int_U dx'\, f_\delta(x - x'), \qquad (A.2)$$

where $f_\delta$ is a one-parameter family of smooth functions converging
weakly to the delta function as $\delta \to 0$. Any real-valued function that
falls to zero rapidly outside U, takes values close to unity well inside U and
interpolates continuously between 1 and 0 in a region of size $\delta$ around
the boundary of U is an adequate smeared characteristic function. We may use
for example a Gaussian family [...]. (A.3)

If the size of the sample set U is L, then the difference of the smeared
characteristic function from a true characteristic function is of the order of
$O(\delta)$. This difference may be quantified by a norm in the space of
functions (usually the $L^1$ norm) of the difference
$\chi_U - \chi^\delta_U$. Indeed, it is easy to estimate that for the Gaussian
smearing function (A.3)

$$\int dx\, |\chi_U - \chi^\delta_U| < c\, \delta, \qquad (A.4)$$

where c is a positive number of order unity. This equation can also be employed
as a definition of a smeared characteristic function. This implies that the
relative difference between the smeared and the genuine characteristic function
is

$$\epsilon = \frac{\int dx\, |\chi_U - \chi^\delta_U|}{\int_U dx} < c\, \frac{\delta}{L}. \qquad (A.5)$$
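
The estimate (A.4) can be checked numerically. The sketch below is an
illustration under an explicit assumption: the smearing function is taken to be
the normalized Gaussian $f_\delta(x) = (2\pi\delta^2)^{-1/2} e^{-x^2/(2\delta^2)}$,
which is one natural choice and not necessarily the family used in the text.
For U = [a, b] the smeared characteristic function then reduces to a difference
of error functions. The names `smeared_chi` and `l1_difference` are
hypothetical.

```python
import math


def smeared_chi(x, a, b, delta):
    """Gaussian-smeared characteristic function of U = [a, b] (assumed f_delta)."""
    s = math.sqrt(2.0) * delta
    return 0.5 * (math.erf((b - x) / s) - math.erf((a - x) / s))


def l1_difference(a, b, delta, n_grid=40_001):
    """Midpoint-rule estimate of the integral of |chi_U - chi_U^delta|."""
    lo, hi = a - 10 * delta, b + 10 * delta  # the integrand is negligible beyond
    h = (hi - lo) / n_grid
    total = 0.0
    for k in range(n_grid):
        x = lo + (k + 0.5) * h
        chi = 1.0 if a <= x <= b else 0.0
        total += abs(chi - smeared_chi(x, a, b, delta)) * h
    return total


L = 1.0  # size of the sample set U = [0, L]
for delta in (0.01, 0.02, 0.04):
    diff = l1_difference(0.0, L, delta)
    print(delta, diff / delta, diff / L)  # c of (A.4) and epsilon of (A.5)
```

With this Gaussian and $L \gg \delta$ the ratio `diff / delta` is close to
$4/\sqrt{2\pi} \approx 1.6$, a constant of order unity independent of $\delta$,
and the relative difference scales like $\delta/L$.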

It is not possible to obtain an equation analogous to (A.4) for the difference
$\chi_U - \chi^\delta_U$ weighted by the probability. However, if we define the
margin M of the sample set U as the region of $\Omega$ in which
$|\chi_U - \chi^\delta_U|$ is appreciably larger than zero (say larger than a
fixed small number $r \ll 1$), we may estimate that

$$\int dx\, p(x)\, |\chi_U - \chi^\delta_U| < \int_M dx\, p(x) + O(r), \qquad (A.6)$$

hence for a suitable choice of r we can always find a constant c of the order of
unity such that

$$\int dx\, p(x)\, |\chi_U - \chi^\delta_U| < c \int_M dx\, p(x). \qquad (A.7)$$

For general p we cannot improve the above inequality. The physically interesting
case is one for which p varies at a scale much larger than the size $\delta$ of
the margin, otherwise any probabilistic information would be completely lost
beneath the sampling error. In that case one may estimate that

$$\int dx\, p(x)\, |\chi_U - \chi^\delta_U| < c'\, \frac{\delta}{L}\, p(U). \qquad (A.8)$$

For the most general case the following estimation is relevant

$$\int dx\, p(x)\, |\chi_U - \chi^\delta_U| < c'\, \frac{\delta}{R}\, p(U), \qquad (A.9)$$

where R is the size of the area of support of $p(x)\chi_U(x)$.

B Generalization of the results of section 4.2 

Unsharp measurements. The result (4.10) can be reproduced even for unsharp
measurements of a continuous pointer function X. In that case one substitutes
equations (4.7) and (4.8) with

$$X[Q_t(x_0, Q_0)] = f(x_t(x_0, Q_0)) + O(\delta), \qquad (B.1)$$

$$\begin{aligned}
p(U, t) &= \int dx_0\, dQ_0\, |\psi_0|^2(x_0)\, |\phi|^2(Q_0)\, \chi_U[X_t(x_0, Q_0)] \\
&= \int dx_0\, |\psi_0|^2(x_0)\, \chi^\delta_U[f(x^{sys}_t(x_0))] = \langle \psi_0 | e^{i\hat{H}t} \hat{F}^\delta_U e^{-i\hat{H}t} | \psi_0 \rangle, \qquad (B.2)
\end{aligned}$$

where now $\hat{F}^\delta$ is a POVM for the variable f characterized by a
parameter $\delta$, which incorporates the effects of the interaction with the
measurement device. The proof follows the same steps as the discrete variable
case, the only difference being that we substitute the characteristic functions
with smeared ones. Since for a family of characteristic functions
$\chi^\delta$ labeled by $\delta$

$$\chi^\delta_{U_1} \chi^\delta_{U_2} = \chi^\delta_{U_1 \cap U_2} + O(\delta), \qquad (B.3)$$

it is easy to conclude that if the locality condition (4.9) holds, the
probabilities $p_2$ coincide up to an error of order $O(\delta)$ with those
obtained by a stochastic process constructed from the degrees of freedom of the
system by itself. Hence for samplings of size much larger than $\delta$ the
probabilities do not depend on properties of the measurement device, in
contrast to the results of standard quantum theory like Eq. (3.32).



Markov process. The same arguments may be applied to non-deterministic hidden
variable theories that reproduce the predictions of quantum theory through a
Markov process. We denote by $g(x, Q, t | x', Q', t')$ the propagator
corresponding to the interacting dynamics between system and apparatus, by
$h(Q, t | Q', t')$ the one corresponding to the self-dynamics of the apparatus
and by $g_{sys}(x, t | x', t')$ the one corresponding to the self-dynamics of
the system in the absence of the apparatus.

Assuming an initially factorized state, the conditions that the stochastic
process reproduces the operational predictions of single-time quantum theory
for a discrete pointer X are

$$X(Q_t) = f(x_t), \qquad (B.4)$$

$$\begin{aligned}
p(U, t) &= \int dx_0\, dQ_0\, dx_t\, dQ_t\, |\psi_0|^2(x_0)\, |\phi|^2(Q_0)\, g(x_0, Q_0, 0 | x_t, Q_t, t)\, \chi_U[X(Q_t)] \\
&= \int dx_0\, dx_t\, |\psi_0|^2(x_0)\, g_{sys}(x_0, 0 | x_t, t)\, \chi_U(f(x_t)), \qquad (B.5)
\end{aligned}$$

where t denotes the time that the measurement has been completed.

In a two-time measurement the total propagator for the degrees of freedom of
the measured system and the two measuring devices factorizes as
$g(x_0, Q^1_0, 0 | x_t, Q^1_t, t) \times h(Q^2_0, 0 | Q^2_t, t)$ for all times
prior to the first measurement, as the two devices are assumed non-interacting.
The key assumption, analogous to the locality postulate (4.9) in deterministic
systems, is that the propagator for the total system factorizes after the first
measurement as

$$g(x_t, Q^2_t, t | x_{t'}, Q^2_{t'}, t')\, h(Q^1_t, t | Q^1_{t'}, t'), \qquad t, t' > t_1, \qquad (B.6)$$

which states that the particle's stochastic evolution is not affected by the
degrees of freedom of the first apparatus after the interaction has been
completed.

With the assumptions above, it is easy to show, following steps analogous to
those of (4.10), that the two-time probability

$$\begin{aligned}
p(U_1, t_1; U_2, t_2) = &\int dx_0\, dQ^1_0\, dQ^2_0\, dx_{t_1}\, dQ^1_{t_1}\, dQ^2_{t_1}\, dx_{t_2}\, dQ^1_{t_2}\, dQ^2_{t_2} \\
&\times |\psi_0|^2(x_0)\, |\phi_1|^2(Q^1_0)\, |\phi_2|^2(Q^2_0)\, g(x_0, Q^1_0, 0 | x_{t_1}, Q^1_{t_1}, t_1)\, h(Q^2_0, 0 | Q^2_{t_1}, t_1) \\
&\times \chi_{U_1}[X_{t_1}]\, g(x_{t_1}, Q^2_{t_1}, t_1 | x_{t_2}, Q^2_{t_2}, t_2)\, h(Q^1_{t_1}, t_1 | Q^1_{t_2}, t_2)\, \chi_{U_2}[X_{t_2}] \qquad (B.7)
\end{aligned}$$

equals

$$\begin{aligned}
p(U_1, t_1; U_2, t_2) = &\int dx_0\, dx_{t_1}\, dx_{t_2}\, |\psi_0|^2(x_0)\, g_{sys}(x_0, 0 | x_{t_1}, t_1) \\
&\times \chi_{U_1}(f(x_{t_1}))\, g_{sys}(x_{t_1}, t_1 | x_{t_2}, t_2)\, \chi_{U_2}(f(x_{t_2})). \qquad (B.8)
\end{aligned}$$

Hence probabilities are again described by a stochastic process for the
measured system's degrees of freedom, in conflict with the predictions of
quantum theory. To reproduce the predictions of quantum theory with a Markov
process, one would need to assume a violation of Eq. (4.9).
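
The structure of (B.8) can be illustrated with a discrete toy model. The sketch
below is not taken from the text: it replaces $g_{sys}$ by a hypothetical
three-state Markov chain (the numbers in `p0` and `T` are arbitrary) and checks
that summing the two-time probabilities over a partition of the first sample
space reproduces the single-time probability at the second time. This
additivity is the property that the quantum expression (5.7) fails to share
with the single-time quantum probabilities.

```python
from fractions import Fraction as F

# Hypothetical three-state Markov chain standing in for g_sys.
p0 = [F(1, 2), F(1, 3), F(1, 6)]       # initial distribution
T = [[F(1, 2), F(1, 4), F(1, 4)],      # one-step transition matrix,
     [F(1, 3), F(1, 3), F(1, 3)],      # each row sums to 1
     [F(0), F(1, 2), F(1, 2)]]
STATES = (0, 1, 2)


def two_time(U1, U2):
    """p(U1, t1; U2, t2): sample U1 after one step and U2 after two steps."""
    return sum(p0[x0] * T[x0][x1] * T[x1][x2]
               for x0 in STATES for x1 in U1 for x2 in U2)


def single_time(U2):
    """p(U2, t2) with no sampling at t1, computed from the two-step matrix."""
    T2 = [[sum(T[i][k] * T[k][j] for k in STATES) for j in STATES]
          for i in STATES]
    return sum(p0[x0] * T2[x0][x2] for x0 in STATES for x2 in U2)


U2 = (0, 1)
# Sum over a partition {0}, {1, 2} of the sample space at the first time.
marginal = two_time((0,), U2) + two_time((1, 2), U2)
print(marginal, single_time(U2))
```

Exact rational arithmetic is used so that the equality of the two printed
numbers is not blurred by rounding.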



C Frequency operators 

In some interpretations of quantum theory, it is often asserted that Born's
rule for probabilities can be obtained from a weaker postulate, namely that if
an observable A is measured on a system in one of its eigenstates, the outcome
is the corresponding eigenvalue. The idea is to construct a Hilbert space for
the statistical ensemble $H_{ens}$ as an (ideally infinite) tensor product
$\otimes_n H_n$ of the Hilbert space H of the single system [46, 47] (see also
[48] and references therein). Assuming a projection operator P corresponding to
a state $|i\rangle$ of H, we may construct a PVM corresponding to the different
values of the frequencies f for the event corresponding to P in the statistical
ensemble. For a finite ensemble consisting of N copies this PVM reads

$$\Pi_P(f = n/N) = \sum_{k_1 + k_2 + \ldots + k_N = n} P_{k_1} \otimes \ldots \otimes P_{k_N}, \qquad (C.1)$$

where $k_i = 1$ corresponds to $P_{k_i} = P$ and $k_i = 0$ corresponds to
$P_{k_i} = 1 - P$. One then may (attempt to) prove that the vectors
$\otimes_n |\psi\rangle_n$ are eigenstates of the frequency operators
$F_P = \sum_{n=0}^{N} \frac{n}{N}\, \Pi_P(f = n/N)$ in the limit $N \to \infty$,
with eigenvalues coinciding with the standard probabilities
$|\langle i | \psi \rangle|^2$.
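
The sense in which the product state becomes an eigenstate of $F_P$ can be made
concrete. In the state $\otimes_n |\psi\rangle_n$ the projector
$\Pi_P(f = n/N)$ has expectation value $\binom{N}{n} p^n (1 - p)^{N - n}$ with
$p = |\langle i | \psi \rangle|^2$, so the mean and variance of $F_P$ are those
of a binomial frequency. The sketch below evaluates them; the value `p = 0.3`
and the function name `frequency_statistics` are assumptions made for
illustration.

```python
from math import comb


def frequency_statistics(p, N):
    """Mean and variance of the frequency operator F_P in the product state."""
    # Expectation values of the projectors Pi_P(f = n/N), n = 0, ..., N.
    weights = [comb(N, n) * p**n * (1 - p) ** (N - n) for n in range(N + 1)]
    mean = sum((n / N) * w for n, w in enumerate(weights))
    var = sum((n / N - mean) ** 2 * w for n, w in enumerate(weights))
    return mean, var


p = 0.3  # assumed value of |<i|psi>|^2
for N in (10, 100, 400):
    mean, var = frequency_statistics(p, N)
    print(N, mean, var)  # the mean stays at p, the variance falls like 1/N
```

The vanishing of the variance as N grows is the finite-N counterpart of the
eigenvalue statement; it relies on the $\Pi_P$ being mutually orthogonal
projectors.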

The underlying concept in this approach is that the Born rule may be derived 
solely from the projective geometry of the Hilbert space by making reference 
to the ensemble as an individual quantum system. This approach faces some 
problems in its mathematical implementation [49], but it is interesting to see 
whether it can be applied to sequential measurements. 

The key obstacle is that one cannot assign projection operators corresponding
to the outcomes of sequential measurements. Even for ideal measurements the
best we can do is to construct a POVM like (3.17), in which two successive
readings i and j of the variable x will correspond to the positive operators
$K_{ij} = P_i e^{i\hat{H}t} P_j e^{-i\hat{H}t} P_i$. The analogue of the PVM
(C.1) would therefore be a POVM $\Pi_K$, in which $K_{ij}$ would be inserted in
place of the projector P. It is easy to verify that the failure of the
idempotency condition $K_{ij}^2 = K_{ij}$ implies that

$$\Pi_K(f_{ij} = n/N)\, \Pi_K(f_{ij} = n'/N) \neq 0 \quad \text{if } n \neq n', \qquad (C.2)$$

and that this property persists even in the limit $N \to \infty$. These POVMs
cannot distinguish between different values of the frequency in the ensemble.
It is, therefore, not possible to obtain probabilities solely from the geometry
of Hilbert space, because the positive operators $\Pi_K(f = n/N)$ cannot be
associated with specific values of frequency in a way that respects the
projective character of Hilbert space geometry.
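
The failure of idempotency behind (C.2) is already visible in a two-dimensional
example. The sketch below is an illustration under assumptions: P projects on
the first basis vector, and the evolved projector
$e^{i\hat{H}t} P_j e^{-i\hat{H}t}$ is taken to be the projector Q on the equal
superposition of the two basis vectors (no specific Hamiltonian is implied).
For an ensemble of a single copy, the two elements of the POVM are K and 1 - K,
and their product does not vanish.

```python
from fractions import Fraction as F


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


identity = [[F(1), F(0)], [F(0), F(1)]]
P = [[F(1), F(0)], [F(0), F(0)]]               # projector on |0>
Q = [[F(1, 2), F(1, 2)], [F(1, 2), F(1, 2)]]   # projector on (|0> + |1>)/sqrt(2)

K = matmul(matmul(P, Q), P)                    # positive operator K = P Q P
K_squared = matmul(K, K)
one_minus_K = [[identity[i][j] - K[i][j] for j in range(2)] for i in range(2)]
overlap = matmul(K, one_minus_K)               # Pi_K(f = 1) Pi_K(f = 0) for N = 1

print(K, K_squared, overlap)
```

Here K has the single non-zero entry 1/2, its square has 1/4, and the overlap
K(1 - K) has 1/4, so the operators associated with different frequencies are
not orthogonal.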

Alternatively one could define a POVM corresponding to frequencies
$f_1 = n_1/N$ for the outcome i of the first measurement and $f_2 = n_2/N$ for
the second

$$\Pi(f_1 = n_1/N, 0; f_2 = n_2/N, t) = \Pi(f_1 = n_1/N)\, [\otimes_n e^{i\hat{H}t}]\, \Pi(f_2 = n_2/N)\, [\otimes_n e^{-i\hat{H}t}]\, \Pi(f_1 = n_1/N). \qquad (C.3)$$

This POVM is different from the one obtained from the modification of (C.1),
because different fine-grained alternatives are used in its construction. It is
still subject to Eq. (C.2) and cannot distinguish between different values of
frequency.

The results above imply that the programme of defining probabilities through
frequencies (without a priori assuming Born's rule) cannot account for
sequential measurements. The only way to salvage it is to take Eq. (C.2) at
face value and assume that different values of the frequency cannot be
distinguished in sequential measurements, implying in effect that multi-time
probabilities are ill-defined, or that they do not converge.

D Distinguishability of non-converging frequencies



We consider a two-time measurement of position as described in section 3.2.
We assume that a beam of free particles with mass m is prepared in a state
$\psi_0$, centered around x = 0, with zero mean momentum $p_x$ and a spread L
in the x direction. We consider for simplicity only two samplings at each
moment of time, corresponding to the sets $U_+ = [0, \infty)$ and
$U_- = (-\infty, 0]$. The corresponding projectors are $P_+$ and $P_-$. We then
consider the candidate probability that the particle is detected in $U_+$ at
time $t_1$ and in $U_+$ at time $t_2$,

$$p_{++} = \langle \psi_0 | P_+ e^{i\hat{H}t} P_+ e^{-i\hat{H}t} P_+ | \psi_0 \rangle, \qquad (D.1)$$

while the obstruction to additivity equals

$$b = 2\, \mathrm{Re}\, \langle \psi_0 | P_+ e^{i\hat{H}t} P_+ e^{-i\hat{H}t} P_- | \psi_0 \rangle. \qquad (D.2)$$



Figure 4: The particles pass through a slit of width 2L and are registered in two successive [...]
(figure labels: particle; $U_+$; $U_-$; history $(U_+, U_+)$; history $(U_+, U_-)$).



The details of the wave-function's shape do not significantly affect the result.
For calculational convenience we consider

$$\psi_0(x) = \frac{1}{\sqrt{2L}}\, \chi_{[-L,L]}(x), \qquad \text{(D.3)}$$

where $\chi_{[-L,L]}$ is the characteristic function of the set $[-L, L]$, corresponding
for instance to a slit of width $2L$ placed immediately before the first detector.
We then obtain



$$p_{++} = \frac{1}{\pi} \int_0^1 dz\, \mathrm{Si}[z^2/\tau] \; [\ldots] \qquad \text{(D.4)}$$

where $\tau$ is the dimensionless time-scale and Si stands for the sine-integral
function. In Fig. 5 we plot the ratio $b/p_{++}$ as a function of $\tau$: it starts from 0
at $\tau = 0$ (in which case we have a single-time measurement), it increases rapidly,
and for $\tau \sim 1$ reaches the asymptotic value 1/2. In other words, the assumed
non-convergence of probabilities is manifested very strongly even for the highly
coarse-grained sample sets $U_+$ and $U_-$.

Figure 5: A plot of the ratio $b/p_{++}$ versus dimensionless time $\tau$ for the system of Fig. 2.
This ratio measures the failure of additivity in the natural probability assignment and estimates
the relative size of the area of non-convergence for event frequencies.
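
The quantities of Eqs. (D.1) and (D.2) can also be estimated numerically. The following is a minimal sketch, not the calculation used in the paper: it places the box state of Eq. (D.3) on a periodic grid, evolves it freely with an FFT, and evaluates $p_{++}$, the analogous quantity for the history $(U_-, U_+)$, and $b$. It assumes units with $\hbar = 1$ and, since the definition of the dimensionless time is not legible here, takes $\tau = t/(mL^2)$ purely as a working hypothesis. The function name `two_time_quantities` and all parameter names are illustrative.

```python
import numpy as np


def two_time_quantities(tau, L=1.0, m=1.0, n_grid=4096, box=40.0):
    """Return (p_pp, p_mp, b, p_single) for a box state on a periodic grid.

    p_pp     : candidate probability for the history (U_+, U_+)
    p_mp     : candidate probability for the history (U_-, U_+)
    b        : interference term 2 Re <psi0| P_+ e^{iHt} P_+ e^{-iHt} P_- |psi0>
    p_single : probability of U_+ at the second time with no first sampling
    """
    x = np.linspace(-box, box, n_grid, endpoint=False)
    dx = x[1] - x[0]
    k = 2.0 * np.pi * np.fft.fftfreq(n_grid, d=dx)  # grid wave numbers

    # Box state supported on [-L, L], normalised on the grid.
    psi0 = np.where(np.abs(x) <= L, 1.0, 0.0).astype(complex)
    psi0 /= np.sqrt(np.sum(np.abs(psi0) ** 2) * dx)

    plus = x >= 0.0   # grid version of the projector P_+
    minus = ~plus     # grid version of the projector P_-

    # ASSUMPTION (not from the source): tau = t / (m L^2), with hbar = 1.
    t = tau * m * L ** 2
    phase = np.exp(-1j * k ** 2 * t / (2.0 * m))  # free evolution in k-space

    def evolve(psi):
        return np.fft.ifft(phase * np.fft.fft(psi))

    def inner(a, c):
        return np.sum(np.conj(a) * c) * dx

    phi_p = plus * evolve(plus * psi0)    # P_+ e^{-iHt} P_+ psi0
    phi_m = plus * evolve(minus * psi0)   # P_+ e^{-iHt} P_- psi0
    full = plus * evolve(psi0)            # P_+ e^{-iHt} psi0

    p_pp = inner(phi_p, phi_p).real
    p_mp = inner(phi_m, phi_m).real
    b = 2.0 * inner(phi_p, phi_m).real
    p_single = inner(full, full).real
    return p_pp, p_mp, b, p_single
```

On the grid the relation `p_pp + p_mp + b == p_single` holds up to rounding, which is the sense in which $b$ obstructs additivity. The ratio `b / p_pp` can be scanned over `tau` and compared with the plotted curve, bearing in mind that the normalisation of $\tau$ used here is an assumption and that the periodic grid only approximates free propagation on the line.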

In any statistical sample there exists an extended region of convergence for
relative frequencies due to sampling errors or systematic uncertainties, or due
to the finite number of experimental runs. These errors can be accounted for by
positive operators that approximate projectors within an order of the error. If $d$
is the size of the error, we need to employ operators of the form
$\tilde{P}_U = \int dx\, \chi_U(x) |x\rangle\langle x|$,
defined by the smeared characteristic functions $\chi_U$. The related error equals
$|\mathrm{Tr}\,\rho (P_U - \tilde{P}_U)| \sim \frac{d}{R}\, \mathrm{Tr}(\rho P_U)$, where $R$ is the size of the support of $\rho(x, x)\chi_U(x)$.
For the configuration of Fig. 4, $R \simeq L$ and $d$ is the width of the particle's trace,
so the error is at most of the order of $d/L$.

The operators describing the sampling of measurement outcomes also have
to incorporate the indeterminacy in measurement time; see the discussion in
section V.5. One should therefore employ a time-averaged projector, on an
interval of width $\tau$ around the moment of time $t$. This corresponds to the
proposition that the measurement took place at any time within the interval
$[t - \tau/2, t + \tau/2]$,

$$\Pi_U = \frac{1}{\tau} \int_{t-\tau/2}^{t+\tau/2} ds\, e^{iHs} P_U e^{-iHs}. \qquad \text{(D.5)}$$

To leading order in $\tau$, the spread of $\Pi_U$ is $\epsilon = [\ldots]$. The indeterminacy in time
$t$ is related to the interaction of the system with the first sheet and should be
of the order [...], where $\delta p_z$ is the momentum transfer in the $z$ direction as the
particle crossed the sheet. Hence we estimate that $\epsilon \sim [\ldots]$. The uncertainty
relation suggests that $\delta p_z$ is of the order of $1/d$, so $\epsilon \simeq [\ldots]$, where $v_z$ is the
mean velocity of the particles in the $z$ direction. Since the non-additivity of
probabilities is manifested for $\tau \sim 1$, we may substitute the corresponding value
of $t$ to obtain $\epsilon \sim [\ldots]$. Hence the total uncertainty in the measurement is of
the order of $c_1 \frac{d}{L} + c_2 [\ldots]$, with $c_1, c_2$ numbers of order unity. For realistic values
of $L = 1$ cm, $d = 10^{-2}$ cm, $v_z = 10^4$ m/s, the error due to time indeterminacy
is of the order of $10^{-4}$, much smaller than the ratio $d/L \simeq 10^{-2}$ of relative
error in position sampling. It follows that the non-convergence of probabilities,
if present, is more pronounced than the measurement uncertainties and can
in principle be detected.